ABSTRACT

This dissertation proposes a design for a machine structure which is
appropriate for APL and which evaluates programs in this language efficiently.

The approach taken is to study the semantics of APL operators and data
structures rigorously and analytically. We exhibit a compactly representable
standard form for select expressions, which are composed of operators which
alter the size and ordering of array structures. In addition, we present a set
of transformations sufficient to derive the equivalent standard form for any
select expression. The standard form and transformations are then extended
to include expressions containing other APL operators.

By applying the standard form transformations to storage access functions
for arrays, select expressions in the machine can be evaluated without having
to manipulate array values; this process is called beating. Drag-along is a
second fundamental process which defers operations on array expressions, 
making possible simplifications of entire expressions through beating and also 
leading to more efficient evaluations of array expressions containing several 
operations. 

The APL machine consists of two separate sub-machines sharing the same 
memory and registers. The D-machine applies beating and drag-along to defer 
simplified programs which the E-machine then evaluates. The major machine 
registers are stacks, and programs are organized into logical segments. 

The performance of the entire APL machine is evaluated analytically by 
comparing it to a hypothetical naive machine based upon presently-available
implementations for the language. For a variety of problems examined, the 
APL machine is the more efficient of the two in that it uses fewer memory 
accesses, arithmetic operations, and temporary stores; for some examples, 
the factor of improvement is proportional to the size of array operands. 

CHAPTER I
INTRODUCTION 

an optimist is a guy that has never 
had much experience 

Don Marquis, archy and mehitabel 

The electronic digital computer has progressed from being a dream, to an
esoteric curiosity, to its present pervasive and indispensable role in modern
society. Over the years, man's uses of computers have become increasingly
sophisticated. Of particular importance is the use of high-level programming
languages which have made machines more accessible to problem-solvers.

In general, the use of problem-oriented programming languages requires a
relatively complex translation process in order to present them to machines.
Although this can be done automatically by compilers, there is a wide gap to
bridge between the highly-structured concepts in a programming language such
as ALGOL, PL/I, or APL and the relatively atomic regime of today's computers.
In effect, there exists a mismatch between the kinds of tasks we want to present
to machines and the machines themselves. One possible way to eliminate this
difference is to investigate ways of structuring machines to bring them closer
to the kinds of problems people wish to solve with them.

A. A Programming Language

A particular programming language in which this mismatch with contemporary
machines is especially obvious is APL, based on the work of K. E. Iverson
(Iverson [1962]). APL is a concise, highly mathematical programming language
designed to deal with array-structured data. APL programs generally contain
expressions with arrays as operands and which evaluate to arrays, while most
other languages require that array manipulations be expressed element-by-element.
To complement its use of arrays as operands, APL is rich in operators which
facilitate array calculations. Also, it is highly consistent internally both
syntactically and semantically, and hence could be called "mathematical". Because
of its use of structured data and its set of primitives which are quite different
from those of a classical digital computer, APL does not fit well onto ordinary
machines. It is nevertheless possible to do so, and interpreters have been written
for at least three different machines (Abrams [1966]; Berry [1968]; Pakin [1968]).
Finally, because of its mathematical properties, it is possible to discuss the
semantics of the language rigorously and to derive significant formal results about
expressions in the language.

B. The Problem

The problem considered in this dissertation is to design a machine structure
which is appropriate to APL. "Machine structure" here means a general
functional scheme and not a detailed logical design. The expected result is not a
set of specifications from which a circuit designer could produce a working device,
but rather a superstructure into which the features of the language fit cleanly.
Thus, this design must in some sense be natural for the language. For example,
the primitive operations and data structures should include those of APL. In
addition, the machine should take advantage of all available information in order
to execute programs as efficiently as possible. We use the word "machine" in
a very broad sense: what it really means here is "algorithm" and not necessarily
any particular physical device. Such a machine could be implemented as a
conventional program or as a hardwired device or as a microprogram in an
appropriate system. For the purposes of this work, it doesn't really matter.

"APL" means any programming language which includes the semantics of
APL\360 (Pakin [1968]). We shall not be concerned with the particular syntax
of APL, although this currently appears to be the best way to represent the
semantic ideas of the language. In short, the machine should be able to handle
array-structured data with ease and should be able to evaluate functions on such
data using the operators of APL as basic primitives.

The approach taken is to invest a considerable amount of effort in the analysis
of the mathematical properties of the operators and data structures of APL and
to exploit these results in the design of the machine. Thus, a major part of this
work will be dedicated to a rigorous, mathematical investigation of APL
expressions. This study is contained in Chapter II. In Chapter III, the work of
Chapter II is related to the design of a machine, and the design goals are set
forth in detail. Chapter IV discusses the proposed machine design, and Chapter V
is an evaluation of the machine with respect to the goals of Chapter III.

It should be emphasized that the goal of designing an APL machine is a rather
broad one. Although there are clearly practical applications of such a design,
that is not the major focus of this work. Rather, we hope that by investigating
this language and machine in detail, it will be possible to learn something about
the basic processes in computing and find ways of reflecting these processes in
a machine structure. The results summarized in Chapter VI and the new research
problems suggested by this work indicate that this goal has been fulfilled.

C. Historical Perspective

For the purposes of this dissertation, we are primarily interested in previous
work in the area of language-directed machine design (McKeeman [1967]; Barton [1965]).
To some extent, all machine design can be considered to be language-directed, in
that one wishes to implement some particular (machine) language in a piece of
hardware. However, let us consider only the class of machines which might
better be called "higher language inspired"; that is, machines which are based
in some way on languages capable of expressing concepts at a higher level than
are normally associated with assembly code.

The first such machine was reported in 1954, and was a relay device capable
of directly evaluating logical expressions (Burks, Warren, and Wright [1954]).
In addition, this machine used input in parenthesis-free (Polish) notation, thus
doubling its historical interest. The logic machine typifies one major class of
language-inspired machine designs in that its machine language is identical to the
high-level source language. The other major class of language-inspired designs
is more concerned with the processing of the semantics of the source language, 
rather than direct acceptance of the exact language by the machine. In fact, most 
designs fall between the two extremes, as even those which accept the source 
language directly do some preliminary transformations on it to produce a simpler 
intermediate language. 

Other language-inspired machines accepting source language directly include
an ALGOL 60 machine (Anderson [1961]), two FORTRAN machines (Bashkow,
Sasson and Kronfeld [1967]; Melbourne and Pugmire [1965]), the ADAM machine,
based on a special symbol-oriented language (Mullery, Schauer and Rice [1963];
Meggitt [1964]), and a machine for EULER, a generalization of ALGOL (Weber
[1967]). Of these devices, some were to be implemented in hardware (e.g.,
Bashkow et al.; Mullery et al.) while others were implemented in microprogram
(Meggitt; Weber).

Machines which are more concerned with semantic processing to the extent
that their machine languages are significantly different from a higher-level
language include the Burroughs B5000 (Barton [1961]; Burroughs [1963]) which is
essentially an ALGOL machine, a PL/I machine (Sugimoto [1969]) and the Rice
University computer (Iliffe and Jodeit [1962]). Current work in this area includes
a PL/I machine (Wortman [1970]) and a micro-computer capable of emulating
high-level processes easily (Lesser [1969]).

Most of these efforts are not directly relevant to the work in this dissertation 
and are thus reported here only for completeness. The common aspect of all these 
designs is that they are concerned with the processing of more highly organized 
information and programs than are found in the conventional von Neumann 
type architectures. Most of them include generalized addressing schemes using 
some modification of descriptors, as well as at least one stack. 

Although the Burks, Warren, and Wright machine was the first to use Polish
notation as a machine language, the first commercially produced devices to do so
apparently were the English Electric KDF9 (Davis [1960]) and the Burroughs B5000.
Both of these machines included stacks. Other related efforts not yet mentioned
are two machines based on lower-level machine languages, but intended to deal
with high-level primitives. One of these (Iliffe [1968]) is based on extensive use
of descriptor logic for both programs and data, while the other (Myamlin and
Smirnov [1968]) is somewhat more closely oriented toward higher-level languages.
The latter, in particular, does run-time evaluation of infix arithmetic expressions.

Aside from the work of Burks et al., none of the designs in the literature seem
to be derived from explicit mathematical analysis of their input languages. Further,
except for simulations or actual performance, none of the papers in the literature
present satisfactory evaluations of their designs. This is not to say that the
designs are not satisfactory: to the contrary, the success of the Burroughs family
of computers and the KDF9 shows that language-inspired designs are a viable
approach to the development of new machines. On the other hand, nobody seems to
have established exactly how viable such designs really are.

D. Conclusion

Having briefly reviewed the developments of language-inspired machine design
to date, we can now leave them in the background. The present approach is different
from those in the past in that it is based on a mathematical analysis of the
semantics of the source language. Also, the evaluation of the resulting design is
analytic, and gives a clear comparison of this APL machine to other similar devices.
There are, of course, similarities to the designs of the past. In particular, the
use of program segments, data descriptors, and stacks is not novel in itself,
although the machine developed here is substantially different from those mentioned
in the last section.



"The thing can be done," said the Butcher, "I think.
    The thing must be done, I am sure.
The thing shall be done! Bring me paper and ink,
    The best there is time to procure."

L. Carroll, The Hunting of the Snark

CHAPTER II
MATHEMATICAL ANALYSIS OF APL OPERATORS 

This chapter examines the mathematical properties of some of the APL 
operators. Mathematical definitions of the operators are given from which it is 
possible to deduce their properties. We show that there is a standard form for 
expressions containing selection operators, and that there is a complete set of 
transformations to obtain it. A similar form which generalizes inner and outer 
products is introduced with transformations appropriate to obtain it. Finally, 
the relation between these operators and others in APL is discussed. 

This kind of analysis is important for several reasons. First, in its own
right it contributes to the understanding of the operators and data-structures in
APL. Second, and most important for this work, it provides a strong mathematical
basis for the design of the machine to be discussed later. In particular, the ideas
discussed here are reflected in the drag-along and beating processes, which are
fundamental in the proposed machine design.

A. On Meta-Notation



APL is a programming language, and as such is best suited for describing 
processes, while mathematics is primarily concerned with discussing relations 
rather than processes. Thus, in order to do mathematics with APL, it is neces- 
sary to use some notations that are not available in the language itself. Some of 
these meta-notations are actually extensions of the language which might well be 
included in APL to make it more powerful, while others are necessitated by the 
analytic approach, and do not reflect shortcomings in APL. In the next section, 
definitions of objects not in APL are clearly noted as such. 

B. Preliminary Definitions

The definitions to follow are given partly in APL and partly in meta-notation.
Hence this and the remaining sections in this chapter assume a minimal "reading
knowledge" of APL. The APL summary in Appendix A will be helpful to the reader
not fluent in this language. Also recommended are the APL\360 Primer (Berry
[1969]) and APL\360 Reference Manual (Pakin [1968]). At first, it might appear
that defining APL operators in terms of other (intuitively but not formally defined)
APL operators is elliptical. In fact, there is no circularity since the definitions
could be given in more primitive forms, but at the cost of less perspicuity. Since
the goal here is not the development of a coherent theory of APL expressions but
rather the illumination of the behavior of these expressions, the current mode of
explication was chosen. The use of "undefined" APL operators is made advisedly
and no special or esoteric applications of them are made in the following definitions.
The basic problem here is that of using a formalism to describe a formalism.
At some point it is necessary to assume a previous knowledge of something in
order to avoid an infinite regress. "Nothing can be explained to a stone; the
reader must understand something beforehand." (McCarthy [1964], p. 7)

The definitions will be numbered Dn for easier reference. Theorems and
transformations will be numbered Tn and TRn, respectively. In APL expressions 
to follow, the convention that unparenthesized subexpressions associate to the 
right will be used wherever this does not lead to confusion. Material which can 
be skipped in the first reading is enclosed in heavy brackets. For the most part, 
this includes formal statements in definitions which are necessary for proving 
theorems and correctness of transformations, but which are not essential to 
understanding the content of this chapter.

D0. Identity: (Meta) If 𝒜 and ℬ are expressions, then

        𝒜 ↔ ℬ

    means they have identical values.

The sign '↔' is used for identity because the more traditional equality
sign '=' is reserved for use as a dyadic scalar operator in APL.

D1. Conditional Expression: (Meta) The conditional expression

        IF B THEN A ELSE C
    has as its value the value of A if B ↔ 1, the value of C if B ↔ 0, and is
    undefined otherwise.

McCarthy [1963] discusses formal properties of conditional expressions,
some of which are used in the proofs in this chapter.

D2. Index Origin: (Meta) The index origin is the lower bound on subscripts in
    APL expressions. It will be referred to as IORG.

In general, this work attempts to show explicit dependencies on index origin. 
However, to do so throughout simply complicates many expressions without adding 
insight. Whenever it is unstated we use 1-origin indexing. 

D3. Interval Function: If N is a non-negative integer scalar, the interval
    function of N, denoted by ιN, is a vector of length N whose first element is
    IORG, and whose successive elements increase by 1.

    [Formally, ιN ↔ IF N=0 THEN EMPTY VECTOR ELSE (ιN-1),N+IORG-1.]
Thus, one representation for the empty vector is ι0.

D4. Odometer Function: (Meta) If R is a vector of non-negative integers, the
    odometer function of R, denoted by ιR, is a matrix with dimension (×/R),ρR
    whose rows are the mixed-radix representation to base R, of the (×/R)
    consecutive integers, starting with IORG. This extension is not a part
    of APL, but is useful for discussing individual subscripts of an array.
    [Formally, for each I∈ι×/R, (ιR)[I;] ↔ IORG+R⊤I-IORG.]

Example: ι3,2 ↔ 1 1
                1 2
                2 1
                2 2
                3 1
                3 2
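
As an informal illustration of D4 (not part of the original text), the odometer function can be sketched in Python. The names `odometer` and `encode` and the `iorg` parameter are assumptions made for this sketch; `encode` stands in for the APL representation operator used in the formal definition.

```python
from itertools import product

def encode(radix, n):
    """Mixed-radix digits of n to base `radix`, most significant first."""
    digits = []
    for r in reversed(radix):
        n, d = divmod(n, r)
        digits.append(d)
    return tuple(reversed(digits))

def odometer(shape, iorg=1):
    """Rows of the odometer matrix of `shape`, in row-major order."""
    # product() varies the last coordinate fastest, like an odometer.
    return [tuple(d + iorg for d in digits)
            for digits in product(*(range(n) for n in shape))]

rows = odometer((3, 2))
# Row I of the formal definition is IORG + encode(R, I - IORG); here IORG = 1.
formal = [tuple(d + 1 for d in encode((3, 2), i - 1)) for i in range(1, 7)]
print(rows == formal)
```

The two constructions agree because counting in mixed radix and enumerating subscripts in row-major order are the same process.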

D5. Row Membership: ELT is a function whose left operand is a vector and
    whose right operand is a matrix, defined as follows:

        L ELT R ↔ IF (ρL)=(ρR)[2] THEN ∨/R∧.=L ELSE 0.

    That is, the relation is true (has value 1) if and only if the left operand
    vector is identical to one of the rows in the right operand matrix.

Example: (1,3) ELT ι3,2 ↔ 0
         (2,2) ELT ι3,2 ↔ 1

D6. List: (Meta) If L is a vector, then the list of L, denoted by ;/L, is a
    subscript list made up of the elements of L. That is,

        ;/L ↔ L[1];L[2];...;L[ρL].

Example: M[;/ι5] ↔ M[1;2;3;4;5]

D7. Ravel: The ravel of M, denoted by ,M, is a vector containing the elements
    of M in row-major order. The dimension is

        ρ,M ↔ ×/ρM
    If M is a scalar, then ,M is a one-element vector.
    [Otherwise for each I∈ι×/ρM, (,M)[I] ↔ M[;/(ιρM)[I;]].]

Example: , 1  3   ↔ 1,3,5,7,9,11
           5  7
           9 11

         ,1,3,5 ↔ 1,3,5

D8. Reshape: Let R be a vector of non-negative integers. Then the R reshape
    of M, denoted by RρM, is an array with dimension R, whose elements are
    taken from M (possibly with repetition) in row-major order.
    [Formally, for each L ELT ιR,

        (RρM)[;/L] ↔ (,M)[IORG+(×/ρM)|R⊥L-IORG].]

Example: (3,2)ρι6 ↔ 1 2
                    3 4
                    5 6

         4ρ1,2,3,4,5 ↔ 1,2,3,4

         (2,4)ρι3 ↔ 1 2 3 1
                    2 3 1 2
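
The formal statement of D8 says that each subscript list is decoded to a row-major position, which is then reduced modulo the number of elements of M. A minimal Python sketch of that rule follows; it is an illustration only, and the names `decode` and `reshape` are assumptions, with `decode` standing in for the APL base-value operator.

```python
from itertools import product

def decode(radix, digits):
    """Value of mixed-radix `digits` to base `radix`."""
    value = 0
    for r, d in zip(radix, digits):
        value = value * r + d
    return value

def reshape(shape, ravel_m):
    """Elements of the reshape in row-major order, taken cyclically from ravel_m.

    `digits` plays the role of L-IORG, so the result does not depend on the
    index origin. ravel_m is assumed to be non-empty.
    """
    size = len(ravel_m)  # number of elements of M
    return [ravel_m[decode(shape, digits) % size]
            for digits in product(*(range(n) for n in shape))]

print(reshape((2, 4), [1, 2, 3]))
```

The result is returned as a flat row-major list; splitting it into rows of length 4 gives the matrix of the last example above.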



D9. Partial Subscripting: (Meta) M[[K] S] denotes the partial subscripting
    of array M along the K-th coordinate. In other words,

        M[[K] S] ↔ M[;...;S;...;]
                     ↑    ↑    ↑
                     1    K   ρρM

    Formally,

        ρM[[K] S] ↔ ((K-1)↑ρM),(ρS),(K↓ρM)
    and for each L ELT ιρM[[K] S],
    if S is a vector, then

        (M[[K] S])[;/L] ↔ M[;/((K-1)↑L),S[L[K]],K↓L]
    and if S is a scalar, then
        (M[[K] S])[;/L] ↔ M[;/((K-1)↑L),S,(K-1)↓L]

D10. Subscripting: If M is a rank-K array, then for any S1,S2,...,SKM1,SK

        M[S1;...;SKM1;SK] ↔ (...((M[[ρρM] SK])[[(ρρM)-1] SKM1])...)[[1] S1]

The above simply gives a formal definition for array subscripting. It looks
more complex than it really is because APL uses a different syntax for subscripting
than for other operators. If we write S X[K] M instead of M[[K] S], then the
value of the above expression can be rewritten as:

        S1 X[1] ... SKM1 X[(ρρM)-1] SK X[ρρM] M

D11. J-Function: Let LEN be a non-negative integer, ORG an integer, and S∈0,1.
    Then J LEN,ORG,S is an interval vector of length LEN whose least element
    is ORG; if S ↔ 0 then successive elements increase by 1, else they decrease
    by 1. Formally,

        J LEN,ORG,S
          ↔ IF S=0 THEN ORG+(ιLEN)-IORG ELSE (LEN+ORG-1)-((ιLEN)-IORG).

J-vectors are a generalization of the interval function. In particular, J-vectors
can have any origin, are invariant under changes of IORG, and can run forward
or backward.
Example: J 4,2,0 ↔ 2,3,4,5
         J 4,2,1 ↔ 5,4,3,2    and these relations are true for any IORG.
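
To make the invariance claim concrete, the following Python sketch (an illustration, not part of the original text) computes a J-vector in two ways: directly from its description, and by transcribing the formal definition with an explicit index origin. The function names `j_vector` and `j_formal` are assumptions.

```python
def j_vector(length, org, s):
    """J LEN,ORG,S: `length` consecutive integers whose least element is `org`."""
    if s not in (0, 1):
        raise ValueError("S must be 0 (forward) or 1 (backward)")
    forward = list(range(org, org + length))
    return forward if s == 0 else forward[::-1]

def j_formal(length, org, s, iorg):
    """Transcription of the formal definition, with the index origin explicit."""
    iota = [iorg + k for k in range(length)]  # the interval function of LEN
    if s == 0:
        return [org + i - iorg for i in iota]
    return [(length + org - 1) - (i - iorg) for i in iota]

# The index origin cancels out of the formal definition.
print(all(j_formal(4, 2, 1, iorg) == j_vector(4, 2, 1) for iorg in (0, 1)))
```

Because only three integers describe the whole vector, a J-vector can be stored and transformed without ever being expanded.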

D12. Subarray: (Meta) Let M be any array and F an array with dimension
    (ρρM),3. Then the subarray selected by F, denoted F∆M, is

        F∆M ↔ M[J F[1;]; J F[2;]; ... ; J F[ρρM;]]
    where the elements of F are assumed to be in the domain of the above
    expression.
A subarray selected by this function is compact. The subarray function will be
used to provide a standard representation for all the various ways of selecting
compact subarrays.
Example: Let ρM ↔ 10,15
         and F ↔ 4 3 0
                 3 5 1
         then F∆M ↔ M[J 4,3,0 ; J 3,5,1]
                  ↔ M[3,4,5,6 ; 7,6,5]
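
The subarray operator of D12 can be illustrated with a short Python sketch over nested lists. This is an assumption-laden illustration rather than the machine's mechanism: the names `j_vector` and `subarray`, the nested-list representation, and the `iorg` parameter are all choices made here, and the descriptor rows are assumed to lie within the bounds of the array.

```python
def j_vector(length, org, s):
    """J LEN,ORG,S as a Python list."""
    forward = list(range(org, org + length))
    return forward if s == 0 else forward[::-1]

def subarray(f, m, iorg=1):
    """Subarray of nested-list array m; f holds one (LEN, ORG, S) row per coordinate."""
    if not f:
        return m  # every coordinate has been subscripted: a single element
    picks = j_vector(*f[0])
    return [subarray(f[1:], m[p - iorg], iorg) for p in picks]

# A 10-by-15 matrix whose elements record their own 1-origin subscripts.
m = [[(i, j) for j in range(1, 16)] for i in range(1, 11)]
selected = subarray([(4, 3, 0), (3, 5, 1)], m)
print(selected[0])
```

The first row printed consists of the elements with subscripts (3,7), (3,6), (3,5), matching rows 3 through 6 and columns 7, 6, 5 of the example.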

D13. Whole Array: (Meta) For any array M, the whole array of M, denoted
    by ∆M, produces as a result the F such that F∆M ↔ M.

    [Formally, ∆M ↔ ⍉(3,ρρM)ρ(ρM),((ρρM)ρIORG),(ρρM)ρ0.]

Example: If ρM ↔ 6,10,32   then ∆M ↔  6 1 0
         and IORG ↔ 1                10 1 0
                                     32 1 0

D14. Cross Section: (Meta) Let M be any array, F an array with dimension
    (ρρM),2 such that

        (i)   F[;1]∈0,1
        (ii)  (~F[;1])/F[;2] ↔ (+/~F[;1])ρ0
        (iii) (F[;1]/F[;2]) ELT ιF[;1]/ρM
    Then the F cross section of M, denoted by F⍙M, is: ρF⍙M ↔ (~F[;1])/ρM
    and for each L ELT ιρF⍙M, (F⍙M)[;/L] ↔ M[;/(×/F)+(~F[;1])\L]

Cross section is used to formalize the subscripting of arrays by scalars. The
first column of F contains zeros for coordinates to be left intact. Condition (ii)
requires that if F[I;1] ↔ 0 then F[I;2] ↔ 0. This is primarily to make some
of the theorems easier to prove. Entries of 1 in F[;1] correspond to coordinates
indexed by scalars in the corresponding element of F[;2].
Example: Let ρM ↔ 4,7,13

             F ↔ 1  2
                 0  0
                 1 10

         then F⍙M ↔ M[2; ;10]
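
A cross section in the sense of D14 fixes some coordinates at scalar subscripts and leaves the others intact. The Python sketch below illustrates this on nested lists; the name `cross_section`, the nested-list representation, and the `iorg` parameter are assumptions made for the illustration.

```python
def cross_section(f, m, iorg=1):
    """Cross section of nested-list array m; f holds one (flag, index) row per coordinate."""
    if not f:
        return m
    flag, index = f[0]
    if flag == 1:
        # Coordinate subscripted by a scalar: it disappears from the result.
        return cross_section(f[1:], m[index - iorg], iorg)
    # Coordinate left intact: keep every element along it.
    return [cross_section(f[1:], sub, iorg) for sub in m]

# A 4-by-7-by-13 array whose elements record their own 1-origin subscripts.
m = [[[(i, j, k) for k in range(1, 14)] for j in range(1, 8)] for i in range(1, 5)]
selected = cross_section([(1, 2), (0, 0), (1, 10)], m)
print(selected)
```

The result is a vector of seven elements, those with first subscript 2 and last subscript 10, as in the example above.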

D15. Take: If M is any array and A is an integer vector with ρA ↔ ρρM and
    (|A)≤ρM, then A↑M is an array of the same rank as M, as follows: for each
    I∈ιρρM, if A[I]>0 then include the first A[I] elements along the I-th
    coordinate of M; otherwise if A[I]<0 then take the last |A[I]| elements.

    [Formally, A↑M ↔ F∆M
    where F ↔ ⍉(3,ρρM)ρ(|A),(IORG+(A<0)×(ρM)-|A),(ρρM)ρ0.]

D16. Drop: If M and A are as above, then A↓M is similar to the take except that
    for each coordinate, the first (or last) |A[I]| elements are ignored.

    [Formally, A↓M ↔ G∆M
    where G ↔ ⍉(3,ρρM)ρ((ρM)-|A),(IORG+0⌈A),(ρρM)ρ0.]

D17. Reversal: If M is any array then ⌽[K]M is the reversal of M along the K-th
    coordinate.

    [Formally, ⌽[K]M ↔ H∆M
    where H ↔ ⍉(3,ρρM)ρ(∆M)[;1],(∆M)[;2],K=ιρρM.]
    If the subscript on the operator is elided, it is taken to be ρρM.

Example: Let M ↔ 1 2 3
                 4 5 6
                 7 8 9
then,   (2,2)↑M ↔ 1 2          (2,¯2)↑M ↔ 2 3
                  4 5                      5 6

        (2,1)↓M ↔ 8 9          (¯1,1)↓M ↔ 2 3
                                           5 6

        ⌽[1]M ↔ 7 8 9
                4 5 6
                1 2 3
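
The formal parts of D15 through D17 express take, drop, and reversal as subarray descriptors with one (length, origin, direction) row per coordinate. As a hedged illustration, the Python sketch below builds those descriptors from the shape of M alone and expands them to subscripts for the 3-by-3 example; all function names and the `iorg` parameter are assumptions.

```python
def j_vector(length, org, s):
    """J LEN,ORG,S as a Python list."""
    forward = list(range(org, org + length))
    return forward if s == 0 else forward[::-1]

def take_descriptor(a, shape, iorg=1):
    # Length |A|; origin shifted to the far end when A is negative.
    return [(abs(k), iorg + (n - abs(k) if k < 0 else 0), 0)
            for k, n in zip(a, shape)]

def drop_descriptor(a, shape, iorg=1):
    # Length (shape - |A|); origin moves up only when elements are dropped in front.
    return [(n - abs(k), iorg + max(0, k), 0) for k, n in zip(a, shape)]

def reversal_descriptor(k, shape, iorg=1):
    # Same lengths and origins as the whole array; direction set on coordinate k.
    return [(n, iorg, int(i == k)) for i, n in enumerate(shape, start=iorg)]

def subscripts(descriptor):
    """Expand every row of a descriptor into the subscripts it selects."""
    return [j_vector(*row) for row in descriptor]

print(subscripts(take_descriptor((2, -2), (3, 3))))
```

Only the shape of M is consulted; no element of M is touched when a descriptor is built, which is the property the beating process relies on.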

D18. Transpose: If M is any array and A is an integral vector satisfying

        (i)   ρA ↔ ρρM
        (ii)  ∧/A∈ιρρM        i.e., A contains only coordinate numbers of M
        (iii) ∧/(ι⌈/A)∈A      i.e., A is dense
    then the transpose A⍉M of M by A is defined as follows:

        1. ρρA⍉M ↔ 1+(⌈/A)-IORG

        2. For each I∈ιρρA⍉M,
               (ρA⍉M)[I] ↔ ⌊/(A=I)/ρM

        3. For each L ELT ιρA⍉M,
               (A⍉M)[;/L] ↔ M[;/L[A]]
    In other words, A permutes the coordinates of M. Transpose can also
    specify an arbitrary diagonal slice.

Example: Suppose M is a matrix with ρM ↔ 5,6. Then if R ↔ (2,1)⍉M, and
IORG ↔ 1 we have ρρR ↔ 1+2-1 ↔ 2. Further, (ρR)[1] ↔ ⌊/(1=2,1)/5,6 ↔ 6,
(ρR)[2] ↔ ⌊/(2=2,1)/5,6 ↔ 5 and for each L ELT ι6,5, R[;/L] ↔ M[;/(,L)[2,1]]

        or R[L[1]; L[2]] ↔ M[L[2]; L[1]].

Thus, R is the ordinary matrix transpose of M.

Now suppose M is the same as above and R ↔ (1,1)⍉M. Then, ρρR ↔ 1+1-1 ↔ 1.
So the result is a vector. Then (ρR)[1] ↔ ⌊/(1=1,1)/5,6 ↔ 5.
Then for each L∈ι5, we have R[L] ↔ M[;/(,L)[1,1]]

                                  ↔ M[L; L]

So R is the main diagonal of M.
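
The three clauses of D18 translate fairly directly into code. The Python sketch below is an illustration under stated assumptions: the array M is represented by a function `element` from subscript tuples to values, the names `transpose` and `element` are invented here, and A is assumed to satisfy conditions (i) through (iii).

```python
from itertools import product

def transpose(a, m_shape, element, iorg=1):
    """Generalized transpose by coordinate vector `a`.

    Returns the result shape and a dict from result subscripts to elements.
    """
    rank = 1 + max(a) - iorg                                  # clause 1
    shape = [min(n for k, n in zip(a, m_shape) if k == i)     # clause 2
             for i in range(iorg, iorg + rank)]
    cells = {}
    for digits in product(*(range(n) for n in shape)):
        l = tuple(d + iorg for d in digits)
        # Clause 3: coordinate I of M is subscripted by L[A[I]].
        cells[l] = element(tuple(l[k - iorg] for k in a))
    return shape, cells

def identity(subscripts):
    """Stand-in for M: each element records its own subscripts."""
    return subscripts

shape_t, cells_t = transpose((2, 1), (5, 6), identity)
shape_d, cells_d = transpose((1, 1), (5, 6), identity)
print(shape_t, shape_d)
```

With A as (2,1) the result has shape 6,5 and element (2,4) comes from M[4;2]; with A as (1,1) the result is the five-element main diagonal.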

D19. Compression: If X is any vector and U is a logical vector of the same
    length, then U/X is the result of suppressing from X all elements whose
    corresponding entry in U is 0. For an arbitrary array X, U/[K] X compresses
    X along the K-th coordinate.

    Formally, for vector X, ρU/X ↔ +/U and for each I∈ιρU,

        IF U[I]=1 THEN (U/X)[+/I↑U] ↔ X[I]

    This is not a constructive formula for (U/X)[I]; however, such a
    formula is too complex to be useful here. For any array X,

        U/[K] X ↔ X[[K] U/ι(ρX)[K]].

D20. Expansion: If X is any vector and U is a logical vector with +/U ↔ ρX,
    then U\X is a vector with 0 elements wherever U has, and whose other
    elements are taken from X in order.

    The definition of expansion is extended to higher-dimensional arrays in
    the same way as for compression.

    [Formally, ρU\X ↔ ρU and for each I∈ιρU,
        (U\X)[I] ↔ IF U[I] THEN X[+/I↑U] ELSE 0.]

Example: (1,1,0,1,0)/1,2,3,4,5 ↔ 1,2,4
         (1,1,0,1,0)\1,2,3 ↔ 1,2,0,3,0
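
For vectors, compression and expansion as defined in D19 and D20 are close to inverse operations. A small Python sketch (illustrative only; the names `compress` and `expand` are assumptions) reproduces the two examples:

```python
def compress(u, x):
    """Keep the elements of x whose corresponding entry in u is 1."""
    if len(u) != len(x):
        raise ValueError("u and x must have the same length")
    return [xi for ui, xi in zip(u, x) if ui]

def expand(u, x):
    """Spread the elements of x over the 1-positions of u, with 0 elsewhere."""
    if sum(u) != len(x):
        raise ValueError("u must contain exactly len(x) ones")
    remaining = iter(x)
    return [next(remaining) if ui else 0 for ui in u]

u = (1, 1, 0, 1, 0)
print(compress(u, [1, 2, 3, 4, 5]), expand(u, [1, 2, 3]))
```

Compressing an expansion by the same logical vector returns the original vector; the converse holds only where the suppressed elements were already 0.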

C. The Standard Form for Select Expressions

In this section the selection operators considered are take, drop, reversal,
transpose, and subscripting by scalars or J-vectors. Because of the similarity
among the selection operators, we might expect that an expression consisting only
of selection operators applied to a single array could be expressed equivalently in
terms of some simpler set of operators. This expectation is fulfilled in the
standard form for select expressions, to be discussed below.

If the existence of a standard form is to be at all useful, there must be a way
to decide whether a particular expression has a standard form representation and
if so, there must be an effective method to obtain it. In the sequel we show that
every select expression has an equivalent standard form, and exhibit a set of
formal transformations which are sufficient to derive the standard form from an
arbitrary expression.

It may at first seem strange to include subscripting in the set of selection
operators, since its parameters are of a different kind than those for the other
select operators. In the other select operators such as take or drop, the left
operand is a count, which is independent of ways of accessing the argument array.
On the other hand, in subscripting the arguments act like maps rather than counts.
For example, an expression like A↓M has meaning out of context, as long as the
values of A and M are known. Contrariwise the expression M[1;3] cannot be
evaluated without knowledge of the index origin. In the theorems and proofs to
follow, the major complications often come from this dichotomy in the way of
specifying select operations, rather than from the actual content of the material.
Subscripting is included because its effect is similar to the other selection
operators, all of which change only the dimensions and orderings of their operands.

D21. Select Expression: Let ℰ be any (well-formed) array-valued expression.
    Then 𝒮 is a select expression on ℰ if it is a well-formed expression
    consisting of an arbitrary number (including 0) of the following operators
    applied to ℰ:
        (i)   Take
        (ii)  Drop
        (iii) Reversal
        (iv)  Transpose
        (v)   Subscripting by scalars or J-vectors
    By extension, we will also include the subarray and cross section operators
    in this class.

Example: Let M be a rank-3 array. Then by D21,

        (2,1,3)⍉(⌽[2](4,¯6,3)↑M)[; ; J 6,2,1]
is a select expression on M, but

        -M[; ; 5,7,3,1]
is not because it contains the scalar operator '-' and the subscripting is not by
a scalar or J-vector. The definition also admits M as a select expression on M.

D22. Equivalence Transformation: An equivalence transformation on expressions
    is a rule of the form:
        if set of assertions then ℰ ⇒ ℱ

where ℰ and ℱ are expressions. If the set of assertions is true, then expression
ℰ may be replaced by expression ℱ, and the truth of the assertions guarantees
that ℰ ↔ ℱ.

For example (if X is any vector then ⌽⌽X ⇒ X) is an equivalence transformation,
since it is always true that if X is any vector, ⌽⌽X ↔ X.

For any given transformation, it is necessary to prove that it is indeed
equivalence-preserving. If this is the case the transformation is said to be
correct. Note that the notions of expression and transformation and standard
form used here are informal ones. It is possible to make them rigorous, so as
to be acceptable to a logician, but that is irrelevant to the current aims and would
only serve to obfuscate the important mathematical relationships we are trying
to explicate. The correctness proof for each transformation will be called
"Proof of TRn".

D23. Standard Form: A select expression on an array M is in standard form
    (SF) if it is represented as A⍉F∆G⍙M where A, F, G are all of the correct
    size and domain.

In the remainder of this section, we introduce a set of equivalence transfor- 
mations sufficient to transform most select expressions into standard form. In 
the process we prove the correctness of each transformation. The effect of this 
process is a proof of the following important theorem: 

COMPLETENESS THEOREM 1: If 𝒮 is any select expression on an array M,
then 𝒮 can be transformed into an equivalent expression ℱ in standard form.

In order to obtain an SF representation of an arbitrary select expression, we 
must first be able to eliminate the operators take, drop, reversal and subscripting. 
The first four transformations below do this. 

TR1. If M is any array and A is conformable to M for take, then A↑M ⇒ F∆M
    where F ↔ ⍉(3,ρρM)ρ(|A),(IORG+(A<0)×(ρM)-|A),(ρρM)ρ0.

TR2. If M is any array and A is conformable to M for drop, then A↓M ⇒ F∆M
    where F ↔ ⍉(3,ρρM)ρ((ρM)-|A),(IORG+0⌈A),(ρρM)ρ0.

TR3. If M is any array then ⌽[K]M ⇒ F∆M
    where F ↔ ⍉(3,ρρM)ρ(∆M)[;1],(∆M)[;2],K=ιρρM.

These three transformations are obviously correct, as they follow directly from
the definitions of the operators take, drop, and reversal. Their proofs will thus
be omitted.

TR4. If M is any array then M[[K] J LEN,ORG,S] ⇒ F∆M
    where F[K;] ↔ LEN,ORG,S and (K≠ιρρM)/[1]F ↔ (K≠ιρρM)/[1]∆M

That the above is an equivalence transformation requires a small proof:
Proof of TR4:

We must prove that for any array M,

        M[[K] J LEN,ORG,S] ↔ F∆M
where F is as given in TR4. In order to prove the identity, we show first that both
quantities have the same dimensions. Then we show that corresponding elements
of each are identical.

Let R ↔ M[[K] J LEN,ORG,S].

1. By definition, ρR ↔ ((K-1)↑ρM),(ρ J LEN,ORG,S),K↓ρM
                     ↔ ((K-1)↑ρM),LEN,K↓ρM
   and ρF∆M ↔ F[;1]
            ↔ ((K-1)↑(∆M)[;1]),LEN,K↓(∆M)[;1]
            ↔ ((K-1)↑ρM),LEN,K↓ρM
            ↔ ρR

2. For each L ELT ιρR,
       R[;/L] ↔ M[;/((K-1)↑L),(J LEN,ORG,S)[L[K]],K↓L]   (by D9)
   and (F∆M)[;/L] ↔ (M[J F[1;] ; J F[2;] ; ... ; J F[ρρM;]])[;/L]
                  ↔ M[(J F[1;])[L[1]]; ... ; (J F[ρρM;])[L[ρρM]]]
                                                  (by L3 in Appendix B).

   But for each I≠K, (J F[I;])[L[I]] ↔ (J (ρM)[I],IORG,0)[L[I]]
                                     ↔ L[I]   (by L4, Appendix B)
   and (J F[K;])[L[K]] ↔ (J LEN,ORG,S)[L[K]]. Therefore,
   (F∆M)[;/L] ↔ M[L[1] ; L[2] ; ... ; L[K-1] ; (J LEN,ORG,S)[L[K]] ;
                  L[K+1] ; ... ; L[ρρM]]
              ↔ R[;/L].

The preceding proof of TR4 is reasonably simple, and is representative of 
the kind of proof required. Although similar in style, the proofs of the remaining 
transformations are more complex. Since they add little to the exposition, they
are given in Appendix B. 

The following transformation makes it possible to reduce the number of
occurrences of adjacent subarray operators in an expression:

TR5. If M is any array and F and G are conformable for subarrays, then

        FΔGΔM => HΔM
where ρH ↔ ρF and for each I∊⍳ρρM, H[I;] ↔ L,OR,S
where J L,OR,S ↔ (J G[I;])[J F[I;]]
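As an illustration of TR5, the sketch below computes one row of H by forming (J G[I;])[J F[I;]] explicitly and then recovering its (L, OR, S) description. It assumes 1-origin indexing and a non-empty selection; `j_vector` and `coalesce_row` are hypothetical names introduced only for this example.

```python
IORG = 1  # index origin assumed for this sketch

def j_vector(length, org, s):
    """J LEN,ORG,S: LEN consecutive integers from ORG, reversed when S is 1."""
    values = list(range(org, org + length))
    return values[::-1] if s else values

def coalesce_row(g, f):
    """One row of H in TR5: the J-vector (J G)[J F]; assumes a non-empty selection."""
    g_values = j_vector(*g)
    selected = [g_values[x - IORG] for x in j_vector(*f)]
    row = (len(selected), min(selected), int(selected[0] > selected[-1]))
    # The selection is itself a J-vector, which is what makes coalescing possible.
    assert selected == j_vector(*row)
    return row

print(coalesce_row((10, 5, 1), (4, 3, 0)))
print(coalesce_row((10, 5, 1), (4, 3, 1)))
```

Selecting from a reversed J-vector with a forward one yields a reversed result, while two reversals cancel; in both cases a single three-element row replaces two subarray operations.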

Transformations TR1 through TR4 are used to eliminate instances of the
operators take, drop, reversal, and indexing from select expressions by
transforming them into equivalent expressions involving subarray and cross section
operators. TR5 shows how to coalesce two adjacent occurrences of subarray into
one. The remaining transformations, TR6 through TR10, are similar in spirit
and are used to permute the remaining operations into the order required by the
standard form.

TR6. If M is any array and F and G are conformable, then F⍙GΔM => G′ΔF′⍙M
where G′ ↔ (~F[;1])/[1]G
and F′[;1] ↔ F[;1]
and F′[;2] ↔
   F[;1]×(G[;2]+((~G[;3])×F[;2]-IORG)+(G[;3]×(G[;1]+IORG+¯1-F[;2])))

TR7. If M is any array and F and G are conformable to M for cross section,
then F⍙G⍙M => H⍙M

where H[;1] ↔ G[;1]∨(~G[;1])\F[;1]
and H[;2] ↔ G[;2]+(~G[;1])\F[;2]

TR8. If M is any array and F and A are conformable to M for subarray and transpose,
respectively, then

        FΔA⍉M => A⍉F[A;]ΔM.

TR9. If M is any array, Q a scalar, J∊⍳ρρA⍉M then

        (A⍉M)[[J] Q] => IF 1=ρρA⍉M THEN B⍙M ELSE A′⍉B⍙M
where A′ ↔ (A≠J)/A-J<A
and B[;1] ↔ J=A
and B[;2] ↔ Q×B[;1].

TR10. If M is any array and B and A are conformable for transpose, then

        B⍉A⍉M => C⍉M
where C ↔ B[A].
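TR10 can be checked on a small array. The sketch below models an array as a Python dict from 1-origin index tuples to values and handles only permutation transposes; the `transpose` helper is a hypothetical stand-in for the dyadic transpose, not code from the text.

```python
from itertools import product

def transpose(a, m):
    """A transpose M for a permutation A (1-origin): coordinate I of M becomes coordinate A[I]."""
    out = {}
    for idx, value in m.items():
        r = [0] * len(a)
        for x, target in zip(idx, a):
            r[target - 1] = x
        out[tuple(r)] = value
    return out

# A rank-3 array whose elements record their own subscripts.
m = {idx: idx for idx in product(range(1, 3), range(1, 4), range(1, 5))}
a, b = (3, 1, 2), (2, 1, 3)
c = tuple(b[x - 1] for x in a)  # C is B[A] in 1-origin
print(c, transpose(b, transpose(a, m)) == transpose(c, m))
```

The two transposes collapse into one index vector, so the intermediate array never has to be built.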
Now that we have transformations TR1 through TR10, which are proved correct
in Appendix B, we can outline a proof of Completeness Theorem 1. First
note that for any array M, M ↔ (⍳ρρM)⍉(ΔM)Δ(((ρρM),2)ρ0)⍙M.

1. Let 𝒮 be any select expression on M which satisfies the hypotheses of the
theorem. Apply TR1, TR2, and TR3 to 𝒮 enough times to eliminate all instances
of the operators take, drop, and reversal. (In order to be absolutely rigorous,
we would have to prove a replacement theorem which says that if in an expression
ℰ, an occurrence of a subexpression 𝒜 is replaced by an equivalent subexpression
𝒜′ (i.e., 𝒜 ↔ 𝒜′), then the resulting expression ℰ′ is equivalent to ℰ, i.e.,
ℰ′ ↔ ℰ.) Call the result of this operation 𝒮′. Note that 𝒮′ contains only
subscript, Δ, and ⍉ operations. Clearly 𝒮 ↔ 𝒮′ because we have applied
only equivalence transformations.

2. Now for each instance of an indexed quantity, substitute the equivalent
expression using partial indexing, as per definition D10. Write this using the
IX notation mentioned there and apply TR4 to eliminate all instances of J-vector
subscripts and call the resulting expression 𝒮″. It should be obvious that 𝒮″
has the form S1 θ1 S2 θ2 ... SN θN M, where the S quantities are left operands
for the operators θ and the θ's are Δ, ⍉ and IX in arbitrary order. Finally
substitute the expression (⍳ρρM)⍉(ΔM)Δ(((ρρM),2)ρ0)⍙M for M and note that this
subexpression, call it 𝒵_N, is in standard form. Call the resulting expression 𝒮_N,
and again note that 𝒮_N ↔ 𝒮.

3. Consider the following algorithm: at each step, the input is

        𝒮_K ↔ S1 θ1 S2 θ2 ... SK θK 𝒵_K, where 𝒵_K is in standard form, i.e.,
        𝒵_K ↔ AK⍉FKΔGK⍙M.

(a) If K=0 then the algorithm is terminated. Otherwise, look at the operator
θK. Do step 1, 2, or 3 below depending on whether θK is ⍉, Δ or IX, respectively,
and return to step (a).

   1. θK is transpose, ⍉. Apply TR10 to the expression SK⍉𝒵_K ↔ SK⍉AK⍉FKΔGK⍙M,
   to get the equivalent QK⍉FKΔGK⍙M, where QK ↔ SK[AK], and call this 𝒵_(K-1).

   2. θK is subarray, Δ. Apply transformations TR8 and TR5 to SKΔ𝒵_K to get

        SKΔ𝒵_K ↔ SKΔAK⍉FKΔGK⍙M => AK⍉SK[AK;]ΔFKΔGK⍙M => AK⍉FK′ΔGK⍙M,

   where FK′ is obtained by TR5.

   3. θK is indexing by a scalar, IX[J]. Apply transformations TR9, TR6,
   and TR7 to SK IX[J] 𝒵_K, getting

        SK IX[J] AK⍉FKΔGK⍙M => AK′⍉BK⍙FKΔGK⍙M
                            => AK′⍉FK′ΔBK′⍙GK⍙M
                            => AK′⍉FK′ΔGK′⍙M.

In each of steps 1, 2, 3 above, a set of transformations was applied to the
subexpression SK θK 𝒵_K of 𝒮_K. Call the resulting subexpression 𝒵_(K-1). Since all
transformations were equivalence transforms, it is clear that SK θK 𝒵_K ↔ 𝒵_(K-1).
Let 𝒮_(K-1) be the resulting expression from plugging 𝒵_(K-1) into 𝒮_K. Clearly
𝒮_(K-1) ↔ 𝒮_K. Finally observe that each 𝒵_K is in standard form. Hence, in N steps,
the algorithm will terminate with result 𝒮_0 ↔ 𝒮_1 ↔ ... ↔ 𝒮_N ↔ 𝒮, and 𝒮_0 ↔ 𝒵_0,
which is in standard form. This is the desired result. QED.

So far, we have defined a standard form for a subset of select expressions 
and exhibited a complete set of transformations for obtaining the standard form 
representation of an arbitrary expression in this class. Moreover, the proof of 
the completeness theorem gives an algorithm for obtaining the SF of an expression. 
Note that there are alternate ways of formulating the standard form. For instance,
an equivalent formulation says that an expression is in standard form if it is
represented as A⍉B↑C↓⌽[K] D⍙M with B, C non-negative and K a vector of indices
so that the definition of ⌽[K] extends in the obvious way. The choice of using
the meta-notation formulations was made for two major reasons. First, fewer
transformations and therefore fewer proofs are needed to establish completeness.
Second, this formulation is closer to the way these results will be used in the
design of the machine.

Another point to note is that the standard form could be made more general, 
by allowing more operators to be included in the set of selection operators. In 
particular, compression and expansion might be included, as well as reshape 
and catenation. The general rotation operator at first seems to be a possible 
candidate for inclusion, but in fact does not fit in cleanly. This is primarily
because rotations involve taking residues of subscripts, which do not compose in
a simple way. A further extension would allow arbitrary indexing of select
expressions and perhaps extend operations on select expressions to operations
on their subscripts, as in the case ⌽V[S] ↔ V[⌽S].

A final point concerns the significance of the SF and completeness results. 
These results are important in that they establish formally some of the relation- 
ships between APL-like operators which informally may appear obvious. This 
not only provides a useful tool for the programmer, who may make formal trans- 
formations on his programs without a second thought, but it also provides a formal 
basis for automatic transformation of programs and expressions. This second 
property is heavily used in the design of the APL machine. Also important is 
that results such as we have described aid in the understanding of array operators, 
which might be used in generalizing them further or in strengthening the theoretical 
foundation for operations on array data. 

D. The Relation Between Select Operators and Reduction 

Obviously there is more to APL than just selection operators. If the results
of the previous section are to be generally applicable, we must look into the
relationships between select operators and some of the other kinds of operators
in an array language. One result that has been used implicitly in some of the
proofs in Section C is that selection operators are distributive with respect to
scalar arithmetic operators. For instance, (A+B)[S] ↔ A[S]+B[S] and
-⌽V ↔ ⌽-V. This property follows immediately from the definition of scalar
arithmetic operators and the definitions of the select operators, and is stated
formally in the theorem T1 below:

T1. Let A and B be arrays with the same dimensions and M and D be monadic
and dyadic scalar arithmetic operators and T a selection operator; then
(i) if A D B is defined,

        T (A D B) ↔ (T A) D (T B)
(ii) if M A is defined,

        T M A ↔ M T A

T1 contains the restriction that A D B and M A be defined, in order to deal
with cases like ((1,1,1)÷1,1,0)[1,2] in which the result is undefined as written
but is defined after distributing the indexing operator. This result is in fact more
general than as stated. It should be clear that the operator T can also be rotation,
compression, expansion (for some scalar operators) or operators such as ravel
or reshape. A similar result holds if one of A or B is a scalar.
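The restriction in T1 can be seen in a small Python sketch, where vectors are modelled as lists and a division by zero plays the role of a domain error. The helpers `index` and `divide` are hypothetical names used only for this illustration.

```python
def index(v, s):
    """V[S] with 1-origin subscripts."""
    return [v[i - 1] for i in s]

def divide(x, y):
    """Element-by-element division; raises ZeroDivisionError on a zero divisor."""
    return [p / q for p, q in zip(x, y)]

a, b, s = [1, 1, 1], [1, 1, 0], [1, 2]

try:
    index(divide(a, b), s)   # literal evaluation: divide everything, then select
    literal_defined = True
except ZeroDivisionError:
    literal_defined = False

# Distributing the indexing over the division touches only the selected elements.
distributed = divide(index(a, s), index(b, s))
print(literal_defined, distributed)
```

The literal form fails on an element that the selection would discard anyway, while the distributed form is defined.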

One of the most important constructions in APL is reduction which applies a 
dyadic scalar operator between all elements of a vector. Reduction is not an 
operator in the sense we have been using, but is more like a functional. As will 
be shown below, it is possible to change the order of select operators and reductions 
as well as to permute the coordinates of the reducee. As in the previous section, 
these facts will have direct use in the APL machine. The remainder of this section 
defines reduction formally, and presents a set of equivalence transformations 
for expressions involving reductions. 
D24. Reduction: If D is a dyadic scalar operator and V is a vector, then the D
reduction of V, written D/V, is a scalar defined as follows:

        D/V ↔ IF (ρV)>1 THEN V[1] D V[2] D ... D V[ρV]
              ELSE IF (ρV)=1 THEN V[1] ELSE (IDENTITY OF D)

In the expression above, the operators D associate to the right, as usual.
The identities of the scalar dyadic operators are listed in Appendix C.
If M is any array and D is as above then the D reduction over the Kth
coordinate of M is defined as follows:

        ρD/[K] M ↔ ((K-1)↑ρM),K↓ρM
and for each L ELT ⍳ρD/[K] M

        (D/[K] M)[;/L] ↔ D/F⍙M
where F[;1] ↔ K≠⍳ρρM AND F[;2] ↔ F[;1]\L

If the subscript K is elided in the expression D/[K] M, it is taken to be
the last coordinate of M, which is ρρM in 1-origin and ⌈/⍳ρρM in general.
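The vector case of D24 amounts to a right-to-left fold. A minimal Python sketch follows; the identity table covers only three operators and is an assumption made for the example, and `apl_reduce` is a hypothetical name.

```python
from functools import reduce
import operator

# Identities for a few operators only (assumed for this sketch).
IDENTITY = {operator.add: 0, operator.sub: 0, operator.mul: 1}

def apl_reduce(d, v):
    """D/V with D associating to the right; an empty V yields the identity of D."""
    if len(v) == 0:
        return IDENTITY[d]
    # Fold from the right: V[1] D (V[2] D (... D V[n])).
    return reduce(lambda acc, x: d(x, acc), reversed(v))

print(apl_reduce(operator.sub, [1, 2, 3]))  # 1-(2-3)
print(apl_reduce(operator.add, []))
```

Right association matters for non-associative operators such as minus, where a left fold would give a different value.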

In order to do some of the proofs required by this section, we will need to use the
membership and ranking operators, so these operators are defined formally first.

D25. Membership: If A is a scalar and B is any array, then the membership
relation A∊B has value 1 if at least one of the elements of B is identical to
A, otherwise the value is 0. The dimension of the result is the same as
that of A, and the definition is extended element-by-element on A.

[That is, A∊B ↔ ∨/ ... ∨/A∘.=B, where ∨/ is applied ρρB times.]

D26. Ranking: If B is a vector and A is a scalar, then B⍳A denotes the index
of A in B, namely the least subscript J of B such that A ↔ B[J].
[Formally, B⍳A ↔ ⌊/(A=B,A)/⍳1+ρB.]
From the expression above, it is clear that if ~A∊B then the result is
1+⌈/⍳ρB. The operation is extended to arbitrary arrays A element-by-element.

[Thus, if A is any array, then for each L ELT ⍳ρA,
(B⍳A)[;/L] ↔ ⌊/(A[;/L]=B,A[;/L])/⍳1+ρB.]
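The formal definitions of membership and ranking translate almost literally into Python. The sketch below assumes 1-origin indexing and scalar A; `member` and `rank_of` are hypothetical names.

```python
def member(a, b):
    """A member-of B for scalar A and vector B: 1 if some element of B equals A."""
    return int(any(x == a for x in b))

def rank_of(b, a):
    """B iota A: least 1-origin subscript J with A equal to B[J], or 1 + rho B."""
    extended = list(b) + [a]          # B,A guarantees that at least one match exists
    return min(j for j, x in enumerate(extended, start=1) if x == a)

p = [2, 3, 5, 7]
print(rank_of(p, 3), rank_of(p, 4), member(3, p), member(4, p))
```

Appending A to B before searching is what gives the result 1+ρB when A does not occur in B.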

An interesting question about reductions is under what circumstances can the 
coordinates of the reducee be permuted, with reduction carried out on a different 
coordinate, and still have the result remain the same? It is intuitively obvious, 
for example, that +/[1] M ↔ +/[2] (2,1)⍉M, when M is a matrix, since adding
the rows is the same as adding the columns of the transpose. Theorem T2 shows 
that this kind of permuting can be carried out as long as the coordinates that are 
left after reduction are in the same order. 

T2. Let M be any array, D any scalar dyadic operator, K a scalar, and P any
permutation of ⍳ρρM. Then,

        D/[K] M ↔ D/[P[K]] P⍉M
if and only if

        (P[K]≠⍳ρρM)/P⍳⍳ρP ↔ (K≠⍳ρρM)/⍳ρρM

Proof: See Appendix B.

The complicated condition in T2 is a formal statement of the requirement
that permutation by P does not disturb the ordering of the coordinates in M other
than K.

Example: Let M be a rank-4 array. Then, by theorem T2, all of the following
are true:

        +/[2]M ↔ +/[1] (2,1,3,4)⍉M
               ↔ +/[3] (1,3,2,4)⍉M
               ↔ +/[4] (1,4,2,3)⍉M

No other values of P satisfy the condition in T2. For instance if P ↔ 4,2,1,3,
P[2] ↔ 2 and P⍳⍳ρP ↔ 3,2,4,1. So (2≠1,2,3,4)/3,2,4,1 ↔ 3,4,1 which is
not (2≠1,2,3,4)/1,2,3,4 ↔ 1,3,4. This theorem suggests the following
transformation:

TR11. If M is any array and D is a dyadic scalar operator, then

        D/[K] M => D/[LAST] A⍉M,
where LAST is the index of the last coordinate of M (ρρM for 1-origin and
⌈/⍳ρρM in general) and A ↔ (⍳K-1),LAST,((K-1)+⍳(ρρM)-K)
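The condition in T2 and the vector A of TR11 can be explored by brute force. The sketch below models a rank-4 array as a Python dict keyed by 1-origin index tuples, with all coordinate lengths distinct so that a reordering of the surviving coordinates always shows up as a different result. All helper names are hypothetical.

```python
from itertools import permutations, product

def transpose(p, m):
    """P transpose M for a permutation P (1-origin): coordinate I of M becomes coordinate P[I]."""
    out = {}
    for idx, value in m.items():
        r = [0] * len(p)
        for x, target in zip(idx, p):
            r[target - 1] = x
        out[tuple(r)] = value
    return out

def plus_reduce(k, m):
    """+/[K] M on a dict-based array (1-origin K)."""
    out = {}
    for idx, value in m.items():
        key = idx[:k - 1] + idx[k:]
        out[key] = out.get(key, 0) + value
    return out

def t2_condition(p, k):
    """The condition of T2, with the inverse permutation standing for P iota iota rho P."""
    n = len(p)
    inverse = [p.index(j) + 1 for j in range(1, n + 1)]
    lhs = [inverse[j - 1] for j in range(1, n + 1) if j != p[k - 1]]
    return lhs == [j for j in range(1, n + 1) if j != k]

def tr11_vector(k, rank):
    """The transpose vector A of TR11 in 1-origin."""
    return tuple(range(1, k)) + (rank,) + tuple(range(k, rank))

shape, k = (2, 3, 4, 5), 2
m = {idx: sum(x * 10 ** i for i, x in enumerate(idx))
     for idx in product(*(range(1, n + 1) for n in shape))}
satisfying = [p for p in permutations(range(1, 5)) if t2_condition(p, k)]
agree = all((plus_reduce(p[k - 1], transpose(p, m)) == plus_reduce(k, m)) == t2_condition(p, k)
            for p in permutations(range(1, 5)))
print(satisfying, agree, tr11_vector(k, 4))
```

Besides the identity permutation, the permutations that pass the test are exactly the three listed in the example, and the vector produced for K equal to 2 is the last of them.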

TR11 above and TR12, TR13, and TR14 to follow can be used to transform a
select expression on a reduction to a reduction along the last coordinate of a
select expression.

TR12. If M is any array and D a dyadic scalar operator then

        A⍉D/M => D/(A,1+⌈/A)⍉M.

TR13. If M is any array, D a dyadic scalar operator, then

        GΔD/M => D/G′ΔM
where G′ ↔ ((ρρM),3)ρ(,G),(¯1↑ρM),IORG,0.

TR14. If M is any array, D a dyadic scalar operator, and Q a scalar,
then (D/M)[[J] Q] => D/M[[J] Q].

Proofs of TR11, TR13, TR14: Immediate from theorems T2, T3, T4.
Proof of TR12: See Appendix B.

Transformation TR11 forces all reductions to be along the last coordinate of
their operand array. TR12, TR13, and TR14 permit reduction to be "factored
out" of select expressions.
Given these transformations, we can extend the completeness result of the previous
section as follows:

COMPLETENESS THEOREM 2: If 𝒮 is an expression on an array M containing
only selection operators and reductions, then it can be transformed into an
equivalent expression 𝒮′ of the form D1/D2/ ... DK/𝒵 where the DI are the reduction
operators in the order they appeared in 𝒮 and where 𝒵 is in standard form.

Since the proof of this theorem is similar to that for the first completeness theorem,
it will be omitted. Such a proof depends on the correctness of transformations
TR11 through TR14, which follow from the theorems below:

T3. If M is any array, D a dyadic scalar operator then

        GΔD/[K]M ↔ D/[K]G′ΔM
where (K≠⍳ρρM)/[1]G′ ↔ G AND G′[K;] ↔ (ΔM)[K;]

Proof: See Appendix B.

T4. For any array M and D a dyadic scalar operator,

        G⍙D/M ↔ D/G′⍙M
where G′ ↔ ((ρρM),2)ρ(,G),0,0

Proof: See Appendix B.

The following example takes an expression and derives the standard form of
Completeness Theorem 2.

Example: Let ρM ↔ 6,10,12,19 and consider the select expression with
reductions:

        𝒮 ↔ (2,1)⍉+/[1](3,7,4)↑×/[4]M

In each step, we note the transformations applied.

1. 𝒮 ↔ (2,1)⍉+/[3](3,1,2)⍉FΔ×/[4]M              (TR11, TR1)

        where F ↔  3 1 0
                   7 1 0
                   4 1 0

2. 𝒮 ↔ +/[3](2,1,3)⍉(3,1,2)⍉×/[4]GΔM            (TR12, TR13)

        where G ↔  3 1 0
                   7 1 0
                   4 1 0
                  19 1 0

3. 𝒮 ↔ +/[3](3,2,1)⍉×/[4]GΔM                    (TR10)

4. 𝒮 ↔ +/[3]×/[4](3,2,1,4)⍉GΔM                  (TR12)

5. 𝒮 ↔ +/[3]×/[4](3,2,1,4)⍉GΔF⍙M                by definition of ⍙

        where F ↔ 0 0
                  0 0
                  0 0
                  0 0

The above expression is in SF.
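The derivation above can be checked numerically. The sketch below evaluates the expression as first written and in its final form on an array of the stated dimensions, modelling arrays as Python dicts keyed by 1-origin index tuples. The element values (plus or minus one) are arbitrary test data, the take is expressed through the rows that TR1 produces, and all helper names are hypothetical.

```python
from itertools import product
from math import prod

def transpose(a, m):
    """A transpose M for a permutation A (1-origin)."""
    out = {}
    for idx, value in m.items():
        r = [0] * len(a)
        for x, target in zip(idx, a):
            r[target - 1] = x
        out[tuple(r)] = value
    return out

def reduce_axis(fn, k, m):
    """D/[K] M for a commutative, associative D given as a function on a list of values."""
    groups = {}
    for idx, value in m.items():
        groups.setdefault(idx[:k - 1] + idx[k:], []).append(value)
    return {key: fn(values) for key, values in groups.items()}

def subarray(rows, m):
    """F subarray M for rows (LEN, ORG, 0); reversed rows are not handled in this sketch."""
    out = {}
    for idx, value in m.items():
        if all(org <= x < org + length for x, (length, org, _) in zip(idx, rows)):
            out[tuple(x - org + 1 for x, (_, org, _) in zip(idx, rows))] = value
    return out

shape = (6, 10, 12, 19)
m = {idx: (1 if sum(idx) % 3 else -1) for idx in product(*(range(1, n + 1) for n in shape))}

f_rows = [(3, 1, 0), (7, 1, 0), (4, 1, 0)]   # (3,7,4) take, written as subarray rows
g_rows = f_rows + [(19, 1, 0)]               # G of step 2
literal = transpose((2, 1), reduce_axis(sum, 1, subarray(f_rows, reduce_axis(prod, 4, m))))
standard = reduce_axis(sum, 3, reduce_axis(prod, 4, transpose((3, 2, 1, 4), subarray(g_rows, m))))
print(literal == standard, len(literal))
```

In the final form the selection is applied to M before any arithmetic, so the products are formed only for elements that contribute to the result.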



E. The General Dyadic Form: A Generalization of Inner and Outer Products

In APL there are three ways of applying dyadic scalar operators to a pair of
operands. The simplest, the scalar product, is the element-by-element application
of a scalar operator to corresponding elements of conformable arrays. The next
simplest is the outer product, in which the result is obtained by applying the
operator to all possible pairs of elements, one from each operand array, in a
specified order. Finally, the inner product is a generalization of ordinary matrix
product in linear algebra, except that arbitrary (conformable) arrays may
participate as operands and any pair of operators may be used. Before proceeding, let
us present the formal definitions of inner and outer products.
D27. Outer Product: If M and N are arbitrary arrays and D is any dyadic scalar
operator, then the D outer product of M and N, written M ∘.D N, is defined
as follows: ρM ∘.D N ↔ (ρM),ρN. Then for each L ELT ⍳ρM ∘.D N,
(M ∘.D N)[;/L] ↔ M[;/(ρρM)↑L] D N[;/(ρρM)↓L].

D28. Inner Product: If M and N are any arrays such that ¯1↑ρM ↔ 1↑ρN and if
D and F are two dyadic scalar operators, then the D-F inner product of
M and N, written M D.F N, is defined as follows: ρM D.F N ↔ (¯1↓ρM),1↓ρN
and for each L ELT ⍳ρM D.F N, (M D.F N)[;/L] ↔ D/(G⍙M) F H⍙N,
where G[;1] ↔ ((¯1+ρρM)ρ1),0        G[;2] ↔ ((¯1+ρρM)↑L),0
      H[;1] ↔ 0,(¯1+ρρN)ρ1          H[;2] ↔ 0,(1-ρρN)↑L

If one of M or N is a scalar, it is extended to a vector of the same length as
the reduction coordinate. In the sequel, we assume that all operands of inner
product are array-shaped (or have already been extended).

Example: (1,2,3) ∘.× 4,5 ↔  4  5
                            8 10
                           12 15

         (1,2,3) ⌈.+ 4,5,6 ↔ ⌈/(1,2,3)+4,5,6
                           ↔ 9

If M and N are conformable matrices, then

        M +.× N

is the ordinary matrix product of linear algebra.

Although these three product forms appear to be different syntactically and
also in their effect, they are in fact intimately related, and can be considered
as aspects of the same thing. This section shows the close relationship between
scalar, inner, and outer products, and introduces a new (meta) form which
includes these as special cases. We also investigate the effect of select operations
on this new construction called the general dyadic form (GDF), and show that it,
like the standard form on select expressions, is closed under application of select
operations.

The key to the relationship between these apparently diverse constructions
is the generalized transpose operation. By applying a transpose to an outer product,
it is possible to write an expression which specifies a diagonal slice of the original
outer product. For example, if V is a vector, M a matrix, then the expression
1 1 2⍉V∘.+M describes the result of adding V to each of the columns of M. It
would be desirable to understand this expression to mean the result it describes,
namely the result of adding the vector V to the columns of M, rather than the process,
that is the transpose of the outer product of V and M. The difference is important
for two reasons. Using the first interpretation in a situation where the expression
must actually be evaluated, as in a program, requires only the pertinent elements
of the result to be computed. This is especially important when the operands are
large arrays. Second, some information is lost by ignoring the partial results.
For example, the expression ((1,2)÷(1,0))[1] is undefined in the literal sense
but the apparent intended interpretation gives the value 1. Both in the case of
select expressions and in transposes of outer products this is a serious problem,
as it is in direct conflict with the semantics of APL. Formally, the definition of
the language renders expressions such as the one just mentioned undefined, yet
this is really a matter of taste and style. My contention is that at worst this
kind of situation should be an ambiguous one, since it is essentially an instance
of a side effect. That is, the programmer writing such an expression should not
depend on the processor of his program to indicate that a domain error occurred
in the evaluation of an irrelevant partial result. If that is what he wants, there
are direct ways of expressing it, such as writing A←(1,2)÷(1,0), followed by A[1].
In any case, I have taken the view that what should be evaluated is the intent of
an expression, if this is perceivable, rather than the literal expression itself.
Except in cases which produce side effects, both approaches compute identical
values.

Theorems T5 and T6, which follow, establish the essential connections among
the product forms and the transpose.

T5. If A and B are conformable for scalar product, and if D is a dyadic scalar
operator then A D B ↔ ((⍳ρρA),⍳ρρB)⍉A ∘.D B.

Proof: See Appendix B.

T6. If M and N are two arrays conformable for inner product and D and F are
dyadic scalar operators, then M D.F N ↔ D/A⍉M ∘.F N
where A ↔ (⍳¯1+ρρM),(2ρLAST1),(¯1+ρρM)+⍳¯1+ρρN
and LAST1 is the index of the second-to-last coordinate in M ∘.F N
(in 1-origin this is (ρρM)+(ρρN)-1 and ⌈/⍳(ρρM)+(ρρN)-1 in general).

Proof : See Appendix B. 

Example: (T6) If A and B are matrices then

        A +.× B ↔ +/(1,3,3,2)⍉A ∘.× B.
We can see this as follows:

        (+/(1,3,3,2)⍉A ∘.× B)[I;J]
             ↔ +/((1,3,3,2)⍉A ∘.× B)[I;J;]
             ↔ +/A[I;]×B[;J]
             ↔ (A +.× B)[I;J]
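Theorems T5 and T6 can be tried out on small matrices. In the sketch below arrays are Python dicts keyed by 1-origin index tuples, and `transpose` accepts repeated entries in its left operand, in which case it selects a diagonal. The helper names and the sample matrices are assumptions made for this illustration.

```python
def transpose(a, m):
    """General A transpose M (1-origin); repeated entries in A select a diagonal slice."""
    out = {}
    for idx, value in m.items():
        r = {}
        # Keep an element only if coordinates sent to the same target agree.
        if all(r.setdefault(target, x) == x for target, x in zip(a, idx)):
            out[tuple(r[t] for t in sorted(r))] = value
    return out

def outer(fn, x, y):
    """X outer-product Y for a dyadic scalar function fn."""
    return {i + j: fn(u, v) for i, u in x.items() for j, v in y.items()}

def plus_reduce_last(m):
    """+/ along the last coordinate."""
    out = {}
    for idx, value in m.items():
        out[idx[:-1]] = out.get(idx[:-1], 0) + value
    return out

def matrix(rows):
    return {(i, j): v for i, row in enumerate(rows, 1) for j, v in enumerate(row, 1)}

a = matrix([[1, 2, 3], [4, 5, 6]])
b = matrix([[7, 8], [9, 10], [11, 12]])
t6 = plus_reduce_last(transpose((1, 3, 3, 2), outer(lambda u, v: u * v, a, b)))
t5 = transpose((1, 2, 1, 2), outer(lambda u, v: u + v, a, a))
print(t6 == matrix([[58, 64], [139, 154]]), t5 == {k: 2 * v for k, v in a.items()})
```

This sketch builds the whole outer product before slicing it, which is the literal reading; the point of the general dyadic form is that only the diagonal elements need to be computed.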
In previous sections we have looked into the effect of select operators on
single arrays and scalar products. A natural question then is, what is the effect
of the select operators on inner and outer products? In order to approach an
answer, it was necessary to discover an alternate formulation of these constructions, 
which facilitates this kind of analysis. Such an alternative is the general dyadic 
form, defined below. 

D29. General Dyadic Form: An expression on two array operands R and S,
with dyadic scalar operator D is in general dyadic form (GDF) if it is
expressed in the form:

        A⍉R′ ∘.D S′

and the following conditions are satisfied:

(i) R′ and S′ are the standard forms of select expressions on R and S.
(ii) A is a conformable transpose vector for which each of (ρρR′)↑A
and (ρρR′)↓A are in ascending order, and each contains no duplicate
values.
(iii) (ρA⍉R′∘.D S′)[A] ↔ (ρR′),ρS′

The last condition guarantees that if A takes a diagonal slice of the outer product
R′ ∘.D S′, then the length of corresponding coordinates in R′ and S′ are the same.
This can always be done by performing a take operation affecting these coordinates
(see TR17).

Example: If V is a vector, M and N matrices, then the following are in GDF:

        (1,1,2)⍉V ∘.D M,
        (1,3,2,3)⍉M ∘.D (2,1)⍉N,
        (1,1)⍉((1,1)⍉M) ∘.D V

but the following are not in GDF because the conditions on A are not satisfied:

        (1,3,3,2)⍉M ∘.D N
        (1,1,1)⍉M ∘.D V

From definitions D27, D29 and Theorem T5, it is clear that the scalar product
and outer product of R and S by D are special cases of the GDF, obtained by taking
A ↔ (⍳ρρR),⍳ρρS and A ↔ ⍳(ρρR)+ρρS, respectively; D28 and T6 indicate that
an inner product can be expressed as a reduction of a GDF.

In discussing the effect of select operators on GDF's, we will present a series
of transformations, with proofs of their correctness in Appendix B. In the following
transformations, let

        F ↔ (ρρR′)↑A and G ↔ (ρρR′)↓A.

TR15. If W ↔ A⍉R′ ∘.D S′ is in GDF then BΔW => A⍉U ∘.D V where
        U is the SF of R″ ↔ B[F;]ΔR′
        V is the SF of S″ ↔ B[G;]ΔS′

TR16. If W is as above and Q is a scalar, then W[[J] Q] => B⍉U ∘.D V
where B ↔ (J≠A)/A-J<A and
        U is the SF of IF J∊F THEN R′[[F⍳J] Q] ELSE R′
        V is the SF of IF J∊G THEN S′[[G⍳J] Q] ELSE S′

TR17. If W is as above then B⍉W => (F′,G′)⍉U ∘.D V
where F′ ↔ (M∊B[F])/M
      G′ ↔ (M∊B[G])/M        M ↔ ⍳(⌈/B)+1-IORG
      U is the SF of R″ ↔ (F′⍳B[F])⍉(ρB⍉W)[B[F]]↑R′
      V is the SF of S″ ↔ (G′⍳B[G])⍉(ρB⍉W)[B[G]]↑S′
TR18. If M and N are conformable for inner product and D and F are dyadic scalar
operators, then M D.F N => D/A⍉M′ ∘.F N′
where A ↔ (⍳¯1+ρρM),LAST1,(¯1+ρρM)+⍳ρρN
      M′ is the SF of M
      N′ is the SF of (LASTN,⍳¯1+ρρN)⍉N
      LAST1 is the index of the second-to-last coordinate of M ∘.F N
      ((ρρM)+(ρρN)-1 in 1-origin; ⌈/⍳(ρρM)+(ρρN)-1 in general)
      LASTN is the index of the last coordinate of N
      (ρρN in 1-origin; ⌈/⍳ρρN in general).

These transformations are sufficient to establish: 

COMPLETENESS THEOREM 3: Let 𝒮 be an expression consisting only of
reductions and select operators applied to a scalar product, inner product, or
outer product of expressions 𝒜 and ℬ, where 𝒜 and ℬ are select expressions
on arrays R and S respectively. Then 𝒮 can be transformed into an equivalent
expression 𝒮′ of the form D1/D2/ ... DK/𝒲, where 𝒲 is in GDF and the DI's are
the reduction operators appearing in 𝒮, in the same order. If the original
expression 𝒮 contained an inner product, DK is the first operator of the inner
product.

Proof : Similar to Completeness Theorem 1. 

F. Conclusion 

This chapter has discussed some of the formal mathematical properties of
the operators found in APL. Of particular interest are the completeness theorems,
which give conditions under which a subset of APL expressions can be put into
standard form. The general idea of the standard form is that sequences of selection
operators on an expression can be transformed into a shorter sequence of
operations on the same expression. In other words, if ℰ is an expression and S1, ..., SK
are selection operators, then there is a process for finding A, F, and G such that

        S1 S2 ... SK ℰ ↔ A⍉FΔG⍙ℰ.

Completeness Theorem 3 further shows that, in essence, selection operations on
inner, outer, or scalar products can be absorbed into the individual operands.
Also by Completeness Theorems 2 and 3, reductions can be factored out of select
expressions.

Clearly, the whole story has not been told at this point; indeed, the contents
of this chapter barely scratch the surface of the general problem of analysis of
APL semantics. Even so, the results discussed are a sufficient base for the
design of the APL machine discussed in the next chapters. In particular, the
analysis here provides a formal basis for the beating and drag-along processes,
which are the two foundations upon which the APL machine design rests.
APPENDIX A
SUMMARY OF APL

        Monadic form fB                      f     Dyadic form AfB
Definition or example        Name                  Name           Definition or example

+B ↔ 0+B                     Plus            +     Plus           2+3.2 ↔ 5.2
-B ↔ 0-B                     Negative        -     Minus          2-3.2 ↔ ¯1.2
×B ↔ (B>0)-(B<0)             Signum          ×     Times          2×3.2 ↔ 6.4
÷B ↔ 1÷B                     Reciprocal      ÷     Divide         2÷3.2 ↔ 0.625
⌈3.14 ↔ 4    ⌈¯3.14 ↔ ¯3     Ceiling         ⌈     Maximum        3⌈7 ↔ 7
⌊3.14 ↔ 3    ⌊¯3.14 ↔ ¯4     Floor           ⌊     Minimum        3⌊7 ↔ 3
*B ↔ (2.71828..)*B           Exponential     *     Power          2*3 ↔ 8
⍟*N ↔ N ↔ *⍟N                Natural         ⍟     Logarithm      A⍟B ↔ Log B base A
                             logarithm                            A⍟B ↔ (⍟B)÷⍟A
|¯3.14 ↔ 3.14                Magnitude       |     Residue        Case         A|B
                                                                  A≠0          B-(|A)×⌊B÷|A
                                                                  A=0,B≥0      B
                                                                  A=0,B<0      Domain error
!0 ↔ 1                       Factorial       !     Binomial       A!B ↔ (!B)÷(!A)×!B-A
!B ↔ B×!B-1                                        coefficient    2!5 ↔ 10    3!5 ↔ 10
or !B ↔ Gamma(B+1)
?B ↔ Random choice           Roll            ?     Deal           A Mixed Function (See
from ⍳B                                                           Table 3.8)
○B ↔ B×3.14159...            Pi times        ○     Circular       See Table at left
~1 ↔ 0    ~0 ↔ 1             Not             ~
                                             ∧     And            A B   A∧B  A∨B  A⍲B  A⍱B
                                             ∨     Or             0 0    0    0    1    1
                                             ⍲     Nand           0 1    0    1    1    0
                                             ⍱     Nor            1 0    0    1    1    0
                                                                  1 1    1    1    0    0
                                             <     Less           Relations
                                             ≤     Not greater    Result is 1 if the
                                             =     Equal          relation holds, 0
                                             ≥     Not less       if it does not:
                                             >     Greater        3≤7 ↔ 1
                                             ≠     Not Equal      7≤3 ↔ 0

(-A)○B           A    A○B
(1-B*2)*.5       0    (1-B*2)*.5
Arcsin B         1    Sine B
Arccos B         2    Cosine B
Arctan B         3    Tangent B
(¯1+B*2)*.5      4    (1+B*2)*.5
Arcsinh B        5    Sinh B
Arccosh B        6    Cosh B
Arctanh B        7    Tanh B

Table of Dyadic ○ Functions

Primitive Scalar Functions
Name               Sign [1]     Definition or example [2]

Size               ⍴A           ⍴P ↔ 4      ⍴E ↔ 3 4      ⍴5 ↔ ⍳0
Reshape            V⍴A          Reshape A to dimension V      3 4⍴⍳12 ↔ E
                                12⍴E ↔ ⍳12      0⍴E ↔ ⍳0
Ravel              ,A           ,A ↔ (×/⍴A)⍴A      ,E ↔ ⍳12      ⍴,5 ↔ 1
Catenate           V,V          P,1 2 ↔ 2 3 5 7 1 2      'T','HIS' ↔ 'THIS'

Index [3,4]        V[A]         P[2] ↔ 3      P[4 3 2 1] ↔ 7 5 3 2
                   M[A;A]       E[1 3;3 2 1] ↔  3  2 1
                                               11 10 9
                   A[A;...;A]   E[1;] ↔ 1 2 3 4      E[;1] ↔ 1 5 9
                                'ABCDEFGHIJKL'[E] ↔ ABCD
                                                    EFGH
                                                    IJKL

Index              ⍳S           First S integers      ⍳4 ↔ 1 2 3 4
generator [3]                   ⍳0 ↔ an empty vector
Index of [3]       V⍳A          Least index of A in V, or 1+⍴V
                                P⍳3 ↔ 2      4⍳4 ↔ 1
                                P⍳E ↔ 5 1 2 5
                                      3 5 4 5
                                      5 5 5 5

Take               V↑A          Take or drop |V[I]| first (V[I]≥0) or last (V[I]<0)
Drop               V↓A          elements of coordinate I
                                2 3↑X ↔ ABC      ¯2↑P ↔ 5 7
                                        EFG

Grade up [3,5]     ⍋A           The permutation which would order A (ascending
Grade down [3,5]   ⍒A           or descending)
                                ⍋3 5 3 2 ↔ 4 1 3 2      ⍒3 5 3 2 ↔ 2 1 3 4

Compress [5]       V/A          1 0 1 0/P ↔ 2 5      1 0 1 0/E ↔ 1  3
                                                                 5  7
                                                                 9 11
                                1 0 1/[1]E ↔ 1  2  3  4 ↔ 1 0 1⌿E
                                             9 10 11 12
Expand [5]         V\A          1 0 1\⍳2 ↔ 1 0 2      1 0 1 1 1\X ↔ A BCD
                                                                    E FGH
                                                                    I JKL

Reverse [5]        ⌽A           ⌽X ↔ DCBA      ⌽[1]X ↔ ⊖X ↔ IJKL      ⌽P ↔ 7 5 3 2
                                     HGFE                   EFGH
                                     LKJI                   ABCD
Rotate [5]         A⌽A          3⌽P ↔ 7 2 3 5 ↔ ¯1⌽P      1 0 ¯1⌽X ↔ BCDA
                                                                     EFGH
                                                                     LIJK

Transpose          V⍉A          Coordinate I of A becomes coordinate V[I] of result
                                2 1⍉X ↔ AEI      1 1⍉E ↔ 1 6 11
                                        BFJ
                                        CGK
                                        DHL
                   ⍉A           Transpose last two coordinates      ⍉E ↔ 2 1⍉E

Membership         A∊A          ⍴W∊Y ↔ ⍴W      E∊P ↔ 0 1 1 0      P∊⍳4 ↔ 1 1 0 0
                                                     1 0 1 0
                                                     0 0 0 0

Decode             V⊥V          10⊥1 7 7 6 ↔ 1776      24 60 60⊥1 2 3 ↔ 3723
Encode             V⊤S          24 60 60⊤3723 ↔ 1 2 3      60 60⊤3723 ↔ 2 3

Deal [3]           S?S          W?Y ↔ Random deal of W elements from ⍳Y

Primitive Mixed Functions

1. Restrictions on argument ranks are indicated by: S for
   scalar, V for vector, M for matrix, A for Any. Except as
   the first argument of S⍳A or S[A], a scalar may be used
   instead of a vector. A one-element array may replace any
   scalar.

2. Arrays used in examples:

        P ↔ 2 3 5 7      E ↔ 1  2  3  4      X ↔ ABCD
                             5  6  7  8          EFGH
                             9 10 11 12          IJKL

3. Function depends on index origin.

4. Elision of any index selects all along that coordinate.

5. The function is applied along the last coordinate; the
   symbols ⌿, ⍀, and ⊖ are equivalent to /, \, and ⌽,
   respectively, except that the function is applied along the
   first coordinate. If [S] appears after any of the symbols,
   the relevant coordinate is determined by the scalar S.
Type of Array       ⍴A         ⍴⍴A      ⍴⍴⍴A

Scalar                         0        1
Vector              N          1        1
Matrix              M N        2        1
3-Dimensional       L M N      3        1

Dimension and Rank Vectors









⍴A      ⍴B      ⍴Af.gB    Conformability    Definition
                          requirements      Z←Af.gB

                                            Z←f/AgB
        V                                   Z←f/AgB
U                                           Z←f/AgB
U       V                 U=V               Z←f/AgB
        V W     W                           Z[I]←f/AgB[;I]
T U             T                           Z[I]←f/A[I;]gB
U       V W     W         U=V               Z[I]←f/AgB[;I]
T U     V       T         U=V               Z[I]←f/A[I;]gB
T U     V W     T W       U=V               Z[I;J]←f/A[I;]gB[;J]

Inner Products for Primitive Scalar Dyadic Functions f and g









⍴A      ⍴B      ⍴A∘.gB      Definition
                            Z←A∘.gB

                            Z←AgB
        V       V           Z[I]←AgB[I]
U               U           Z[I]←A[I]gB
U       V       U V         Z[I;J]←A[I]gB[J]
        V W     V W         Z[I;J]←AgB[I;J]
T U             T U         Z[I;J]←A[I;J]gB
U       V W     U V W       Z[I;J;K]←A[I]gB[J;K]
T U     V       T U V       Z[I;J;K]←A[I;J]gB[K]
T U     V W     T U V W     Z[I;J;K;L]←A[I;J]gB[K;L]

Outer Products for Primitive Scalar Dyadic Function g



Case            ρR                        Definition

R←1⍉V           ρV                        R←V
R←1 2⍉M         ρM                        R←M
R←2 1⍉M         (ρM)[2 1]                 R[I;J]←M[J;I]
R←1 1⍉M         ⌊/ρM                      R[I]←M[I;I]
R←1 2 3⍉T       ρT                        R←T
R←1 3 2⍉T       (ρT)[1 3 2]               R[I;J;K]←T[I;K;J]
R←2 3 1⍉T       (ρT)[3 1 2]               R[I;J;K]←T[J;K;I]
R←3 1 2⍉T       (ρT)[2 3 1]               R[I;J;K]←T[K;I;J]
R←1 1 2⍉T       (⌊/(ρT)[1 2]),(ρT)[3]     R[I;J]←T[I;I;J]
R←1 2 1⍉T       (⌊/(ρT)[1 3]),(ρT)[2]     R[I;J]←T[I;J;I]
R←2 1 1⍉T       (⌊/(ρT)[2 3]),(ρT)[1]     R[I;J]←T[J;I;I]
R←1 1 1⍉T       ⌊/ρT                      R[I]←T[I;I;I]

Transposition
APPENDIX B

This appendix contains proofs for the transformations and theorems which
were deferred from the main part of Chapter II. They were omitted from the
text because they do not substantially contribute to the exposition of the material,
and are included here for completeness.

The various proofs establish the identity of two expressions ℰ
and ℱ. This is generally done in two steps: in step 1, ρℰ ↔ ρℱ is shown and
in step 2, it is shown that the expressions are identical element-by-element.

Lemmas L1 through L9 state results used in the rest of this appendix. Since
they are all intuitively obvious, and since their proofs follow from the definitions,
these proofs will be omitted.

L1. If M is any array and V is a vector, then

     (M[[K] V])[[K] U] ↔ M[[K] V[U]]

L2. If M is any array, I<J, and U and V are vectors or scalars, then

     (M[[J] V])[[I] U] ↔ (M[[I] U])[[J-0=ρρU] V]

L3. Let M be any array and S1,S2,...,SK be subscript vectors. Then

     for each L ELT ⍳ρM[S1;S2;...;SK],

     (M[S1;S2;...;SK])[;/L] ↔ M[;/T]
     where T is a vector with T[I] ↔ SI[L[I]]
     for each I∊⍳ρρM.

L4. For any integral A (scalar or array) satisfying A≥IORG and (A-IORG)<LEN,

     a. (J LEN,ORG,0)[A] ↔ ORG+A-IORG

     b. (J LEN,ORG,1)[A] ↔ ORG+LEN+IORG+¯1-A

     c. (J LEN,ORG,S)[A] ↔ ORG+((~S)×(A-IORG))+(S×(LEN+IORG+¯1-A))

     d. -J LEN,ORG,S ↔ J LEN,(-(ORG+LEN-1)),~S

     e. K+J LEN,ORG,S ↔ J LEN,(ORG+K),S  if K is an integer

     f. ⌽J LEN,ORG,S ↔ J LEN,ORG,~S

L5. If FΔM is defined, then

     (a) ρFΔM ↔ F[;1]

     (b) for each L ELT ⍳ρFΔM,

         (FΔM)[;/L] ↔ M[;/F[;2]+((~F[;3])×(L-IORG))+(F[;3]×(F[;1]+IORG+¯1-L))]

L6.  a. U/X[S] ↔ X[U/S]

     b. U\U/X ↔ U×X  (if X is numeric)

     c. U/U\X ↔ X

     d. U/V/X ↔ (V\U)/X

     e. (U∧V)/X ↔ (U/V)/(U/X)

     f. U/(X D Y) ↔ (U/X) D (U/Y)  for D a dyadic scalar operator

     g. If D is a dyadic scalar operator with 0 D 0 ↔ 0,
        then U\(X D Y) ↔ (U\X) D (U\Y)

L7. If 0≤ORG1-IORG and (ORG1+LEN1-IORG)≤LEN then

     a. (J LEN,ORG,0)[J LEN1,ORG1,S] ↔ J LEN1,(ORG+ORG1-IORG),S

     b. (J LEN,ORG,1)[J LEN1,ORG1,S] ↔ J LEN1,(ORG+LEN+IORG-(ORG1+LEN1)),~S

L8. If U and V are logical vectors with ρV ↔ +/~U
     then ~(U∨(~U)\V) ↔ (~U)\~V.

L9.  a. If B is a vector and if for any A, A∊B is all ones, then B[B⍳A] ↔ A.

     b. If P is a permutation of ⍳ρP, then if R ↔ P⍳⍳ρP, P[R] ↔ R[P] ↔ ⍳ρP and
        P ↔ R⍳⍳ρR. In other words, for permutation vectors, the ranking
        operator is its own inverse.
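
The parts of lemma L4 can be spot-checked numerically. The sketch below is only an illustration in Python, not part of the original development. It assumes that J LEN,ORG,S denotes the vector of LEN consecutive integers starting at ORG, ascending when S is 0 and descending when S is 1 (the reading suggested by L4 and L5), and it fixes the index origin at 0 so that APL subscripts coincide with Python list indices. The names `j_vector` and `l4c` are hypothetical helpers.

```python
def j_vector(length, org, s):
    """Assumed reading of J LEN,ORG,S: LEN consecutive integers from ORG,
    ascending when S is 0 and descending when S is 1."""
    ascending = [org + k for k in range(length)]
    return ascending[::-1] if s else ascending


def l4c(length, org, s, a, iorg=0):
    """Right-hand side of L4c for a single index a."""
    return org + (1 - s) * (a - iorg) + s * (length + iorg - 1 - a)


LEN, ORG, K = 4, 10, 3
for S in (0, 1):
    vec = j_vector(LEN, ORG, S)
    # L4c (which specializes to L4a for S=0 and L4b for S=1)
    assert [vec[a] for a in range(LEN)] == [l4c(LEN, ORG, S, a) for a in range(LEN)]
    # L4d: negation moves the origin and complements S
    assert [-x for x in vec] == j_vector(LEN, -(ORG + LEN - 1), 1 - S)
    # L4e: adding an integer K shifts the origin
    assert [K + x for x in vec] == j_vector(LEN, ORG + K, S)
    # L4f: reversal complements S
    assert vec[::-1] == j_vector(LEN, ORG, 1 - S)
```

A check of this kind does not replace the proofs, but it guards against sign and off-by-one slips when the lemma is used in the derivations that follow.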
Proof of TR5:

     1. ρFΔGΔM ↔ F[;1] ↔ ρHΔM          (by L5)

     2. For each L ELT ⍳ρFΔGΔM, (FΔGΔM)[;/L] ↔ (GΔM)[;/S]
             where S[I] ↔ (J F[I;])[L[I]]

        and (GΔM)[;/S] ↔ M[;/T]
             where T[I] ↔ (J G[I;])[S[I]]

                        ↔ (J G[I;])[(J F[I;])[L[I]]]

                        ↔ ((J G[I;])[J F[I;]])[L[I]]
        But (HΔM)[;/L] ↔ M[;/U]
             where U[I] ↔ (J H[I;])[L[I]]

                        ↔ ((J G[I;])[J F[I;]])[L[I]]

        Thus, T ↔ U and (FΔGΔM)[;/L] ↔ (HΔM)[;/L].  QED.
We can give explicit formulas for H in TR5. First, H[;1] ↔ F[;1] and
H[;3] ↔ F[;3]≠G[;3]. Finally, for each I∊⍳ρρM, H[I;2] ↔ IF 0=G[I;3]
THEN F[I;2]+G[I;2]-IORG ELSE (IORG++/G[I;1,2])-+/F[I;1,2].
Proof of TR6: 

1. pFAGAM ^-^ (~FC;lJ)/pGAAf 

^ (~F[ ;!])/(?[ ;1] 

^ ff'C;l3 ^ pG'AF'AM. 

2. For each L FLT ipFAGAM, 

(FAgAM)[:/L] ^ ((?AAf)C;/L'] where L' -^ (x/F)+(~K;l])\L (by D14) 
^ ML;/S1 



- 4:4: 



where (by L5), 

S ^^ g[;2] + ((~g[;3])xL'- JOJ?g ) + (gi:;3]x(g[;l]+ IQJ?g +~l-L') 
^ G[;23+((~G[;3])x(x/F)t((~F[;l])\L)-JaSG) 

+(gC;3]x(g[;l]+JQffS+~l-((x/F)+(~F[;l])\L)) 
(ff'AF'AM)[;/L] ^^ (F'AM)[;/r] 
where r ^ g'[;2]+((~g'[;3])xL- IQj?g ) + (g'[;3]x(g'[ ;l]tJ(9J?g+~l-L)) 

Thus, {G'LF'MdK;/Ll ^ ML;/U2 
where y ^-^ (x/f' )+(~f ' [;l])\r 

^ (x/F')+(~F'[;l])\(g'C;2]+((~g'[;3])xL- IQj?g ) 
+(g'[;3]x(g'[;i]+ Igi?g +"i-L))) 
To complete the proof, we need to show that S ↔ U. By lemma L6g,

          X\A+B ↔ (X\A)+(X\B),
     and  X\A×B ↔ (X\A)×(X\B).
Thus, writing E ↔ ~F'[;1] ↔ ~F[;1], and substituting for F',

y ^ (F[;l]x(F[;l]xg[;2]+((~gC;3])xF[;2]-JOFg) 

+(gC;3]x(g[;l]+iaSg+~l-f[;2])))) 
+(F\g'[;2])+((F\~g'[;3])x(E\L)-IQflg) 
+ (g\g'[;33)x(g\g'[;l])+ JOi?g +~l-g\£ 
But E\G^l-M ^ Fxg[;Z] ^ (~F[;l])xg[;Z]forZel,2,3. 

Making this substitution and commuting terms, 

U ^ ((FC;l]+~F[;l])x(g[;2]+((~g[;3])x-JQffg)+g[;3]xg[;l]+JQFg-l) 
+((~g[;3])x(F[;l]xF[;2])+(~F[;l])x(~F[;l])\L) 
+g[;3]x(FC;l]x-F[;23)+(~F[;l])x-(~F[;l])\L . 
But F[;1]+~F[;1] ↔ (ρF[;1])ρ1 and does not contribute to the product in the
first term. Also,

     (~F[;1])×(~F[;1])\L ↔ (~F[;1])\L.
U ^ G[;2]+((~G[;3])x(x/F) + ((~Z[;l])\L)+IOflG.) 

+GC;3]x(?[;l]+JQffG+~l-((x/F)+(~i^[;l])\L) 
^ 5 QED. 
Proof of TR7 : 

1. pFhGN^ <^ (~F[;l])/pffAM ^M- (~f [ ;1])/(~GC ;l])/pM 

^ ((-(?[ ;l])\~K;l])/pM (byL6d) 

pHm <^ (~A'C;l])/pM ^ (~(ff[;l]v(~G[;l])\FC;l]))/p^ 
^ ( (-(?[;!] )\~K;l])/pM (byL8) 

2. For each L SLT ipF^GAM, 

(FAGAM)[;/L] ^ (GA^)[ ;/(x/F)+(~FC ;1] )\L] ^MC;/53 
where 5^ (x/g)+(~gC;1])\(x/f) + (~FC ;1])\I; 
{HAM)L;/L1 ^ M[;/(x/^) + (~E[;l])\L] ^M[;/r] 
where T ^ ((C?[;l]v(~GC;l])\F[ ;l])x((;[ ;2]+(~(?[ ;1])\F[;2])) 

+(~(ff[;l)v(~ff[;l])\J'[;l]))\L 
Expanding the products, and noting that 

G[;l]v(~G[;l])\F[;l] ^ G[ ;!]+(-(?[ ;1] )\FC ;1] , 

we get 

r ^ (x/G)+(G[;l]x(~G[;l])\F[;2])+(G[;2]x(~G[;l])\F[;l]) 

+(((~(?C;l])\F[;l])x(~ff[;l])\F[;2]) + ((~ff[;l])\~FC;l])\L. 
So we must show that S ↔ T. In simplifying T, we use the following, in
order: If U and V are logical vectors,

     U×(~U)\X ↔ (ρU)ρ0
     (U\X)×(U\Y) ↔ U\X×Y          (L6g)

     U\V\X ↔ (U\V)\X
Also recall from the definition of 4 that GL ;2] contains zeros wherever 
(?[;!] does. Thus, we rewrite T: 

T ^ (x/ff)+(GC;2]x(~ff[;i])\FC;l])+((~GC;l])\(x/F))+((~GC;l])\~F[;l])\i 
But the second term goes away because of G^C ;2] 's zeros. 
T ^ (x/(;)+((~ff[;l])\(x/F))+(~GC;l])\(~FC;l])\L 
^ (x/ff) + (~(;[;l])\((x/F)+(~F[;l])\i) 
^ S QED. 
Proof of TR8: 

Clearly the ranks of both expressions are identical, 

1, pFMW^F[;l3 (byL5) 
Now, for each J€ippi4($FU;]AA/ 

(pAiS?FU;]AM)CJ] ^ l/(A=I)/pFLA;lm *^ l/(A=I)/FU;ll 

^ UFIU=I)/A;i:\ (byL6a) 

^' l/(+/A=I)pFLlill ^ FLI',11 -f-> (pFMW)[J] 

2. For each L ELT ipFMW, 

(FMiW)[;/i3 ^ U<s?^)[;/Q3 ^ ML-JQlAll 
where Sti"] ^ («! KI;])CLCI]] 

W(S)FU;]AM)[;/L] ^ (FU;]A^)C;/LC4]: ^M[;/5] 
where Sin ^ (J (FU;])CI;3)C(LU])CI]] 

^ (jj FU[I];])[LU[J]]j 
^ eWCJ]] ^ (Q[/1])[I] 
Thus eU] ^ 5. QED. 
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Proof of TR9 : The case oi(pAisiM) -<-> 1 is trivial and will be omitted. Otiierwise, 

1. ppiAmKLJlQ'i <~^ (ppA>S)M)-l -^ {[/A)-l (in 1-origin) 

pp.4'iS?SAM ^ [/A' ^^ [/iA*J)/A-J<A -f-> [ /{(A^)/A-A<JKL,E,Gl (*) 
where L,E,G exhausts xpA and such that a/4[i]<j and 
aM[£']=J and a/4[(;]>J . (This is possible by commutativity of F .) 
(*) ^ UU*AIL,E,G1)/AIL,E,G1-J<AIL,E,G'\ 

^ r/(((pi4[L])pl),((pA[S])pO),(p4Cff])pl)/(4[I],ylC£'].A[G]) 

-((p4[L,ffJ)pO),(p4[G])pl 
^ \ /AIL^AAIG-]-!) ^ a/AlLl)[([/ALG2)-l 
n J -^ TMthen ALGl ^ lO and the result is UAin ^ (FM)-l. Otherwise, 
AiGl is non-empty and [/ALGl -«-> [/A, so the result is stilKTM)-! , since A 
exhausts ipA, by definition. Thus the ranks of both expressions are identical. 
We now show the dimensions to be identical.

For each Ie\i[/A)-1, 
(pA'6)BM)[J] <^ L/(J=A')/pBAM ^ l/(I=(A*J)/A-J<A)/iA^J)/pM 

^ l/((A^)/I=A-J<A)/(A^J)/pM ^ l/(.(A^J)Al=A-J<A)/pM (by L6e) 
By case analysis, we find that 

{A*J)/^I=A-J<A -^ IF KJ THEN I=A ELSE iI-i-l)=A 
^ A=I+I>J 
Thus, (p4'J?BAM)[J] ^^ l/iA=I+I>J)/pM -^ ipA^)lI+I>J2 (by D18) 

and ipUiS)M)aJlQl)Ln ^ { U*\pA) / pAm)in 

^ ipAis^M)L((.J^\pA)/xppA>S)M)Llll 
^ {pAISiM)Ll+I>j:\ ^ (p4'fi?SAM)[I] 
Therefore both expressions have the same dimension. 
2. For each L ELT \piAistM)LlJlQl,

((Am)anQiK;/Li ^ (/iw)C;/((^-i)tL),e,(j--i)+i] 

^ M[;/(((J--1)+I),Q,(J--1)+L)M]] 
Call this subscript vector S . 

U'<S?BAM)[;/i:] ^ (BAM)[;/LU']] ^ M[ ;/(x/S) + (~B[ ;1] )\LU' ]] 
Call this subscript vector T. it remains to show that 5 -«-> r. First, 
pS -(-^ pT. For each Ie\pS, 

SLII ^ (((J-l)tL),S,(c7-l)+L)U[I]] 
^ IE Ain<J THEN LLAini ELSE IF AlII^J THEN Q ELSE ((^"-1 )4-L)UCl-J-]] 
So, S ^ (Q^J=A)HJ^A)xLLA-J<A']. 

T ^ (.Q->^J^A)HJ*A)\Ll{A*J)/A-J<A'\ ^ (,Q^J^A)+{J^A)\{J*A)/LIA-J<A'] 
^ {Q-xJ^A)+U*A)^LlA-J<Al 
^ S QED. 
Proof of TR10: As in the proof of TR9, the hard part of this proof is to show that
the two expressions B⍉A⍉M and B[A]⍉M have the same dimension.
1. Clearly 5W]W is well-defined since 4 exhausts ipB and pBlAl -f-^ ppM. 
Also, ppBlA^m ^^ [/BIA'] -i-s- r/B -^ ppBiS)A>S)M. By definition of transpose, 
for each Ie\ppB)S)AiS)M, 

ipBm>s>M)lIl ^ l/iI=B)/pAm ^ l/{pAm)\.{I=B)/\ppA>m'}. 
Let us write R ↔ A⍉M and T ↔ (I=B)/⍳ρρR. The remainder of this part
depends primarily on the associativity and commutativity of minimum (⌊).

ipBmisiMKii ^ L/(pi?)cr] ^ L/(pi?)[r[i]], (pi?[rC2]],...,(pi?)crcpp!Z']] 

^ L/(L/(A=2'[l])/pW),(L/U=r[2])/pM) {l/iA=TLppTl)/pM) 

^ L/(U=!Z'Cl])/pM),(U=!Z'[2])/pM) iiA=TLppTl)/pM) 

^ L/(U=rci])vu=rC2])v...vu=r[ppr]))/pM 

^ l/(A€T)/pM (by D25) 
Now-r=BU] ^^ (I=B)LA1 since I is scalar. Also note that ((I=S)U])[X] -^ 1
if and only if ALKl eT . Thus, I=BU 1 ^ AeT and 
(pBU3<SjW)[J] ^ L/(J=BC.4])/pA/ 

-e-> l/{A€T)/pM ^ (pBiS)A<siM)Lll. 
2. For each L ELT \pBtS)A^M, 

{BiSiAiSiM)L;/Ll ^ Um)L;/LLB12 
^ MLi/aiBlKAll 
^ ftfCi/LCSM]]] 
^ {BUl^MK'JLl 
QED. 
Proof of Theorem T2 ; 

The only if part is easiest, as it depends only on the dimensions of the expressions 

involved. Only if part: 

By hypothesis, 5/ [Z] M ^ D/LPini PW. 

Thus, the dimensions of both expressions are identical. Specifically, 

pD/Ln M ^ ((Z-DfpW), K^pM ^ (K^xppM)/pM 
and pO/[P[Z]] M ^ (PLKl*\ppP>S)M)/pPlsiM 
But, since P is a permutation of xppAf thenpP -«-> ppAf 
and pPiS)M ^-*- (pM)[PiippM] ^^ (pW)[PiipP] 

Also, ppPfijAf -«-> ppAf. Hence, 

pD/LPim M <^ {pm*ippM)/{pM)iP\\pp:\ 

^ (pM)C(P[Z];^ippAf)/PiipP] (*) (byL6a) 
and PR/IKIM ^ {pM)l{K^\ppM)/\ppM^ (**) 

But (*) ↔ (**) by hypothesis. Thus, the subscripts of (ρM) are identical
for each expression, i.e.,

(P[Z]*ippAf)/PiipP -M- (K*xppM)/ippM. 
We now proceed with the difficult part of the proof:
If part : 

1. We must show that pD/LKl M ^ pD/[P[Z3] PisiM, 

pD/LKl M ^-» ((K-DipM), K\qM ^ {K^\qqM)/oM -^ (pMKiK^ippM)/ippM2 
But ppP^M <-^ r/P -«-^ ppAf. So for each lexppM, 

(pPW)[I] ^ L/(P=J)/pM ^ L/(pM)[(P=I)/ippM] ^ (pW)C(P=I)/ippM] 
since P has exactly one element equal to J. 

^ (pM)LPxn (byD26) 

Hence, pP<s?M -«-> (pM)[PitpP3. Now, 
pD/[P[Z]3 PisiM -^ (P[Z]*ippPi9M)/pPW -M- (P[Z]*\ppM)/(pM)[PiipP] 

^ (p/i^)[(PU]^ippM)/PiipP] ^ (pW)C(ii:^ippM)/ippW] 
by h5rpothesis 

^-> p2/[Z] m. 
Thus, the dimensions are identical. 

2. The two expressions are identical element-by-element. 
For each L ELT ipD/in M, (D/LKl Af)[ ;/L] ^ D/Fm 
where P[;l] ■«-> K*\ppM 

and PC;23 ^ PC;1]\L 

(0/[PCis:]] Fm)l;/Ll ^ D/GAPm 
where (3[;1] -s-^ PU];^ippAf 
and G[;2] ^ ff[;i]\L 

Let us examine these two reducees element- by- element. First note that 
they have the same rank. For, pFAW -f-^ {K-\ppM)/pM -*->- (pM)CZ] 
and pGhPm ^ (PLKl=\ppM)/pPisiM 

-«-> (pPW)CPCif]] 

^ L/(P[ii:]=P)/pM 

^ (pM)[i?]. 
For each leiipM) [Z] ,

(FAA/)CJ] ^ ML; /El 
where i? ^ (x/F)+(~FC;l] )\I 

^ ((if=^ippii^)\L)+(ii:=ippW)\i 

^ (i:,i)[(iii:-i),(ppM),(is:-i)+i(ppM)-xj 

(GAP<s?i^)[l3 ^ (P(S?W)[;/(x/G)+(~GC;l])\I] 

^ (P«s?^)[ ;/( (PU]*ippM)\L)t(PCZ]=ipp^)\I] 

where 5^ ( (L,J)[( iPU]-l ) ,(l+pL) ,(PCZ]-l) + i(pL)-(PCZ]-l) J)[P] 

((L,J)C(iPCZ]-l),(pp;i^),(P[Z]-l) + i(ppM)-PU]J)[P] 
To complete the proof, we must show that R ↔ S.
In order to look more closely at S, we must find out more about P. Let

     T ↔ P⍳⍳ρP.

Then by hypothesis,

     (P[K]≠⍳ρρM)/T ↔ (K≠⍳ρρM)/⍳ρρM ↔ (⍳K-1),K+⍳(ρρM)-K.

Since P is a permutation, ∧/(⍳ρP)∊P and we would expect to have ∧/(⍳ρT)∊T.
The above equation gives all of T except for the element which equals K.
There are ρT places in T that K could occur, falling into three cases. By
examining each of these cases, we can deduce the structure of P, and thus the
value of S.

(a) Pin *^ K. ThenT <^ i\K-l),K,Ki-i{ppM)-K ^ ippM. 
Thus, P -^ ipp^ and S -(-> R. 

(b) PLK'}<K. Then, T ^ iiPin-l) ,K,{(Pm-l)+i{K-l)-{Pin-l)} ,Ki-\(ppM)-K 
and by lemma L9

P -«-> TxipT 

^ ( iPin-i ) ,( i+(p[^]-i )+i iK-pin ) ) ,PLn ,k+x(ppm)-k 

^ ( iPLKl-l ) APin+iK-PLKl ) ,PLK1 ,K+iippM)-K 
and then 

s ^ (L,i)i(iPLn-i),((pin-i)+iK-PLn),(ppM),K+i(ppM)-n 

^ iL,I)LiiK-l)AppM),K+iippM)-Kl ^R 
(c) PLn>K. In this case, T^ (iK-l) ,(K+xPLKl-K) ,K,PLn+i(ppM)-PLn 
and P ^ TxipT <~^ (\K-1) ,Pin,i(.K-l)+\PLK']-K) ,PLn + i(ppM)-PLn. 
Then, 5^ (L,I)l{xK-l) AppM) AU-l) + iPLKl-K) ,(PLKl-l)+i(ppM)-PLni 

^ (L,I)L(xK-l),ippM)AK-l) + iippM)-Kl ^ E. 
Hence, in all cases S <->- R and therefore PA/\^ -«-^ G^^M 
for each L ELT xpD/LKl M, 
and thus D/lKl M ^ 0/[PU3] PiS)M. QED. 
Proof of TR12: 

1. The ranks of both expressions are clearly equal. Then, for each IexppA<siD/M, 

{pAIS)D/M)Ln'^ i/(A=I)/pD/M -^ l/(A=I)/~l^pM 
Butalso, for each IexppiA,l+[ /A)iS)M, 

(p(A,l+\'/A)isiM)in ^ l/(I=A,l+\'/A)/pM ^ l/({I=A)/~l^pM)AI=l+UA)riipM 
SO pD/U,l+nA)lS)M <-y ~l\p{A,l'^UA)m -^ pA^D/M 

2. For each ^ MK xpAiSfD/M, 

(AIS)D/M)L;/L'2 ^ iD/M)L;/LLAlil ^ D/FMf 
where ^C;l] '^ ([ /xppM)7tippM -^ ((~l+ppAf)pl),o 
and P[;2] ^ FliUMLAl ^ LU1,0 

(D/U,l+WA)^M)l;/L2 ^ D/GA(A,l+[/A)>S)M 
where G[;l] -*-> ([ /xppiA,l+[ /A)>S)M)^\pp(A,l+[ /A)<s^

^ ((~n-pp(4,i+rM)w)pi),o 

^A(ppA>S)D/M)pl),0 

A typical element of this reducee is 

(GAU,l+r/il)<s?M)[I] ^ (U,l+rM)^M)[;/(x/G) + (~(?[;l])\I] 

^ (U,l+rM)5?M)C;/(L,0)+((pL)pO),l3 
^ AfC;/(i:,J)U,l+rM] ^M[;/LU],J] ^ (FM)Cl] 
Thus, the two reducees are equal. QED. 
Proof of Theorem T3 : 

1. pGm/in M ^ GL;ll 

pD/in G'AM ^ {K^xppM) / pG' M4 

■^ (Z*ippii^)/ff'C;l] ^ ffC;l] ^ pGAO/[Z] M 

2. For each L ELT xpGhD/lKl M, 

(GA2/[^] M)C;/L] ^ (^/[X] M)[;/S] "*-> £/FAM 
where S" ^ G[;2]+((~G[ ;3])x£- J0i?g )+g[;3]xg[ ;l]+ JQi?g +~l-L 
and K;l] ■«-> Z;^ippM 
and F[;2] ^ F[;l]\5 

(D/[Z] G'AM)[;/L] "^ D/F'LG'hM 

where F'[;l] -^-> K^ippGM -^-^ Z^^ippA? and F'[;2] -e^- F'C;l]\i: 
Butby TR6, F' ^G' m -^ G'^hF'm 
where G" ^ (~F' [ ;1])/[1]G' <-> (AM)[Z;] 
and F"[;l] ^^ F'[;l] ^-^ F[;l], 

F"C;2] ^ F'C;l]xff'[;2]+((~G'[;3])xF'[;2]-IQflG)+(?'C;3]xG'C;l3 

+ I0flg +~l-y'[;2] 
But F'[;l3xF'[;2] -m- F'[;2]
and for ^£1,2,3. 

F'[;l]xG'[;J] ^ F[;l]\G[;c73 
Thus, distributing the F' C :l ] term and substituting, 

F"L;2] ^ (F[;l]\ffC;2])+((Fi:;i:\(~GC:3]))x(FC;l]\L)-IQffG) 

+ (F[;l]\(?[;3])x(FC;l]\G[;i:)+IQSS+~l-FC;l]\i 
-i-^ F[ ;1]\G[ ;2]+( (~G[ ■,3l)xL-I0RG)+GL ;3lxGl ill+ JORG +~l-L 
-^ F[;l]\5 -f-^ F[;2] 
Hence P" '^ F 

and G"hF^'^ "«-> G"LF^ -^ FM QED. 

Proof of Theorem T4 : 

1. pG^D/M ^ (-(?[;! ])/pD/M ^ (~G[ ;1] )/"l+pM 

pD/G'M -^ ~l\pG'm -^ "l4-(~G'[;l])/pM -m- ~1 + ((~GC ;1]) ,1 )/pW 
-!-> (~GC;l])/**14-pM -M- pG^/M 

2. For each L ELT ipiJAO/Af, 

(GA2/M)[;L] ^ (r/Af)[;/(x/G)+(~(;C;l])\L] -^ D/Fm 
where F[;l] -<->- (r/ippM)*ippii^ 

FC;2] ^ F[;l]\(x/G:)+(~ff[;l])\L ^ (x/(;')+ F[ ;1]\(~GC ;1])\L 
Further, (e/5'AM)[;/L] ^e-> D/F^LG'm -^ D/Em 

where F'[;l] ^ (r/ipp(?'AM)*xppG'AM 
and F'C;2] ^ F'[;1]\L 

and, bj^TRT, flC;l] ^ G'[;l]v(~G'[;l])\F'[ ;1] 
flC;2] ^ G'C;2]+(~G'[;1])\F'[;2] 
Now for each leippFAM,

(FAM)[J] ^ W[;/(x/F)+(~F[;])\J] 

^ WC;/((x/G')tF[;l]\(~{?[;l])\L) + (~FC;l])\J] 
^ MC;/((x/G)+(~GC;l])\L),J3 
Since ^[;1] ^ ((~l+ppM)pi),0 
and (~G'C;1])\F'[;1] ^ ((~G[ ;1]) ,1)\F' [ ;1] 

^ ((~G[;l]),l)\("l4-F'[;l]),~ltF'[;l] 
^ ((~G[;l])\(~l+ppff'AM)pl),0 ^ (~ffC;l]),0 
So ffC;l] ^ G'[;l3v(~G[;l]),0 ^ (GC;1] ,0)v(~GC ;1]) ,0 ^ (r/ippM)*ippAf 
ff[;2] ^ (G[;2],0)+((~GC;1],1)\F'C;2] 

^ (GC;2],0)+((~G[;1])\~1+F'[;2]),0 ^ (G[ ;2]+(~t?C ;l])\i) ,0 
and thus (ffAM)CJ] ^ WC ;/(x/H) + (~SC;l])\I] 

^ M[;/(GL;2]+(~G[;1])\L),J] ^ (FAAf)[J] 
and so ^A/if -«-> faw. 
Therefore GhD/M ^ BIG'm. QED. 
Proof of Theorem T5 : There are two main cases. 

a. One of A or B is a scalar and is extended to the size of the other operand.
Suppose A is scalar. Then A∘.D B ↔ A D B, by definition, and
(⍳ρρA),⍳ρρB ↔ (⍳0),⍳ρρB ↔ ⍳ρρB, which is the identity transpose, and
similarly if B is a scalar.

b. A and B are arrays of identical dimension. Then

1. pp((ipp4),ippB)(5A o.D B ^^ {\ /{xQoA) ,\odB)+1- IORG 

-^ {\ /\odA)+1- IORG -^ ppA 
and for each lexppA, 

(p((ipp^),ippB)6?A °.D S)[J] ^ L/(J=(ipp^),ippB)/(p4),pB 

^ l/iI=ippA)/pA ^ (pA)LIl 
Thus, pA D B ^^ p((ippA),ippB)(s?A <> .D B. 
2. For each L ELT \pA D B,

(((ippA),ippB)iS)/4 o,D S)[;/L] ^ U °.D B)[;/L,L] ^ Ali/Ll D S[;/L] 

^ U P B)[;/L] QED. 

Proof of Theorem T6 : 

1. ppAm °.F N ^ ([/A)+1- I0RG ^ [/\ippM)+(ppN)-l ^ 1+ppM D.F N 
For each leipAiSfM °.F N, 

(pAlSiM °.F N)in ^ L/(J=/l)/pM o.F N ^ l/(I=A)/(pM) ,pN 

-H^ IF Ie\~l+ppM THEN (pM)Ll2 ELSE IF J€ (~H-ppM) + i"l+ppiV 
THEN (pN)l2+I-ppM2 ELSE L/(~l+pM) .l+P^- 
So, pi4(sjM o,F N -^ Cl^pM) ,(l-^pN) ,~l^pM 
and therefore pD/A>s^ o.F N -^ ~i-^pAisiM o,f N 

-«-> ("l+pM),l-l'piV -^ pM D.F N 

2. For each L ELT \pM D.F N, 

(M D.F il7)[;/L] -^ D/iG^M) F HM 
where G andff are as in D28. Also, (D/A^ o.F N)L-JL1 ^ D/EMis^ o.f N 
where ffC;!] ^^ ((~l+pp4S?M <>.F N)pl),0 ^^ (ippM D.F N)pl),0 
and El;2l ^ ff[;l]\i ^ L,0 
To complete the proof, we must show that the two reducees above are identical.
Clearly both have the same dimension, namely ¯1↑ρM.
Then for each I∊⍳¯1↑ρM,
((GAW) F HM)Ln ^-> (GAM)Ln F (^M)CJ] 

^ Ml;/(ri+PpM)^L),n F NL;/IA-~1+PPN)^L1 
(EMIS^ o.F N)Ln ^ {A^M o.F N)l;/L,n ^ (M o.F N)l;/(L ,I}LAll 
^ (M o.F N)l;/(Cl+ppM)fL),I,IA-~l+PpN)iLl 
^ ML-JiCl+ppMnL),n F il7L;/J,(-"l+ppiV)+L] 
-*-> ((GAM) F HM!)in 
Thus, (ffAM) F HM -^ EAA>siM ° .F N , and so theD reductions of each are
identical. QED. 
Proof of TR15: 

1, The ranks of both expressions are the same since the subarray operator 
does not affect ranks. So for each JeippJv', 

{pAIS)U o,D F)[J] ^M- l/(iI=A)/pU o.D V. 

But pU o,DV^ iHLFilMi') o .d HLG;lAS' - 

^ (pECF;]Ai?'),pff[(;;]A5' 
^ ff[F;l],E[G;l] ^ HlF,G;ll ^ HLA;ll 

Thus, (pAlSiU o.D F)[J] ^ l/a=A)/HUill ^ l/HliI=A)/A;ll ^ ff[I;l] 

and therefore pAiS)U o ,d V -^ Hl;l^ -^ pHKW. 

2. For each ^ ELI ipHAW, 

(HAWK;/L1 ^ (Am' °.D 5')C;/P] ^ (/?' ° .D 5')C;/P[^]] 
^ i?'[;/PCF]] D 5'[;/PCG]] 
where P ^ ff[ ;2]+( (~ffC;3] )xi:-Ja2G)+SC;3]xff[ ;i]+jQffG+~l-L 
Ufi?i/ o.D 7)[;/L] ^ (ff" 0.2 S")l;/LLA11 

^ (HCF;]Afl')C;/L[F]3 D (ff[(?; ]A5' )[ ;/LCG]] 
^ i?'[;/!Z'] 5'[;/T'] 

where T ^ g[F;2] + ( (~g[F;3])xL[P]-i'Oi?g)tg[F;3]xg[F;l]+ I6>i?g +~l-LrFl 
-M- P[F] and similarly, 

2" ^ PEG] 
Then UW o.O F)C;/L3 ^ P'[;/PCP]] D 5'[;/P[G3] ^ (ffAA^)[;/L]. 
Finally, the result is in GDF since U and V are in SF and the value of A still 
satisfies the required conditions. QED. 
Proof of TR16:

1. pWim Ql ^ {J*ippW)/pW. To determine pSfijy ° .D V we must first find 

pU o.D V. 

pU ^ pi?"^ IF JeF THEN pi?'CCFiJ"] Ql ELSE pi?' 
There are two cases: 

a. JeF. Then, 

pi?"-M- pR'llFxJl Ql ^ ((FiJ')*ippi?')/pf?' 
^ ((FiJ-)*ippi?')/(pA^)CF3 (byD29) 

^ (pA^)C((FiJ')*xpF)/F] 
^ (pWK(F*J)/Fl 

^ (((J-l)i-pW),(pWKJ:i,JipW)i(F^J)/F:i 
^ i{(J-l)ipW) ,JipW)LiF*J)/F-J<Fl 
since J does not occur in (F*J)/F 

^ (pWLLJl Ql)l(F^J)/F-J<Fl 

b. If -JeF then (F^J) -^ (pF)pl. So in this case, 

pi?" ^ pi?' ^ (pWKFl ^ ipWLLJl Q1)L(F^)/F-J<F1 
So pU ^ (pA/CCJ] Q'])L(F7iJ)/F-J<Fl and similarly, 

pV ^ (pWLLJl Ql)L(G*J)/G-J<G']. 
Therefore, pU o.D V ^ (pfTCEJ] Q'\K{{F^J)/F-J<F) AG^J)/G-J<G1 

^ (pWLUl Q'})LiJ*F,G)/(F,G)-J<F,Gl 
^ (pWLlJl Ql)L(JM)/A-J<A:i 
Then for each JeippSW °.DV, 

(pSW °.0 7)[J] ^ l/(I=B)/pU o.D V 

^ l/(I=iJ*A)/A-J<A)/{pWaJl Q'])l{JM)/A-J<S 
^ l/ipWLLJJi Ql)LiI=UM)/A-J<A)/(J^A)/A-J<Al 
^ (pWLLJl QIKII 
and thus pBisiU o.d V ^ pWLLJl Ql.
2. For each L ELT \pWLLJl Ql, 

(wan Q1)L;/L1 ^ Wl;/iiJ-l)^-L),Q,iJ-l)-i'Ll 

^ (i?' o.D sm;/(iiJ-l)^-L),Q,U-l)-^L)Ull 
^ i?'[;/!Z'[F]] D S'li/TLGi:^ 
where T ^ iiJ-l)iL) ,QAJ-1)^L. 

{BtS)U o,D V)l;/L2 ^ (i?" o .0 S")L;/LLB12 

^ 7?"[;/(ppi?")tLCB]] D 5"[;/(ppi?")+LCS]] 
Consider the i?" term above. There are two cases, as before: 

a, ~JeF. Then, 

i?"[;/(ppi?")+L[S]] ^ i?'C;/(ppi?')fLC(J"=^A)M-J'<it]] 

^ i?'C;/L[(ppi?') + (J''^A)M-J'<4]] 
^ i?'[;/L[(J=^F)/F-J<F]] ^ i?'[;/LCF-J<F]] 
^ R'L;/ii(J-l)fL),Q,(J-l)iL)LFl2 ^ i?'[;/rCF]] 

b. J"eF. 

i?"C;/(pp/?")+LCS]] ^ (i?'[[FiJ] Q])C;/LC(~l+ppi?')t(J'*A)M-J-<4]] 

^ (fl'CCFiJ] Q])C;/L[(J;^F)/F-J"<F]3 
^ (i?'C[Fic7] Q3)C;/L[(~l+Fi^)+F],L[(FiJ)iF-l]] 
because F is in ascending order and +/J=F -«-^ 1 
^ i?'[;/L[(~l+FiJ)fF],a,iC"l+(FiJ)+F]] 
^ i?'[;/(((/-l)tL),e,(e7-l)iI)C(("l+FiJ)+F),F[J-],(FiJ)+F]] 

because of^'s order 
^ i?'C;/T[F]] 
And similarly, S'"[;/(ppi?")4-L[B]] ^ 5'[;/r[G]] 
Thus (f/[Cc7] S])[;/i] ^ (B^y o.p 7)C;/L]. 
Finally, it is clear that the result is in GDF since U and V are in SF and B
satisfies the necessary conditions. QED.
Proof of TR17:

1. ppiF' ,G')^U °.D V ^ {[/F\G')+1-IQRG 

^ {[/i(MeBLFl)/M),(M£BLGl)/M)+l-IORG ^ i\ / iM&BlF ,G'\)/M)+1-I0RG 

^ {[/M)+1- I0RG ^ ([ /i([ /B)+1- I0RG )+1- I0RG 

^ (((r/B)+l-IQg£)+iaflg-l)+l-IOflS ^ ([/B)+1- I0RG ^ ppBW 

For each leippBW, 

(pBiSfW)Lll ^ l/iI=B)/pW 
and (p(F\G')^U <> .D F)CI] ^ L/(I=F' ,G' )/pf/ <> .D V 

^ L/(I=F',G')/(pi?"),p5" 
So we must find ρR" and ρS".

pi?" ^ p(F'iS[F])<s?(pSW)CBCF]]+i?' 
ppi?" ^ (r/F' iS[F])+l -IQffS -H> (r/ipF')+l-IOffS ^ pF' 
Then, for each JeippR'\ 

(pi?")[J"] ^ L/(J'=F'iS[F])/p(pBW)[5[F]]fif' 
^ L/(J=F'iB[F])/(pS«Jfv')CS[F]] 
^ L/(pBW)[(J"=F'iSCF])/B[F]] 
^ L/(pBW)[(F'CJ-]=B[F])/B[F]] 
^ (pSW)[F'[J]] 
Hence pi?" ^ (pBW)[F'] 
and similarly, pS'" ^ (pBW)[ff'], 
and thus (p(F' ,(?' )(S)f/ °.D V)in ^ L/(I=F' ,G' )/(pBW)[F' ,G' ] 

^ i/ipBm)iii=F\Gn/F\Gn 

^ (pSW)[I] 
and therefore p(F',G')W o.p y^ pBJs?{7, 
2. For each L ELT ipBisjft/,

(Sfi?i^)[;/i;] ^ (/?' o.O 5")C;/L[BU]3] 

^ i?'[;/(ppi?')tL[B[y4]]] 2 5'C;/(ppi?')+i:[BU]]] 
^ /?'C;/L[BCF]]] 5'[;/LCBCG]]] 
((F',(;')^i/ o.£ 7)[;/L] ^ (i?" o.£ 5")[;/L[F',ff']] 

^ i?"[;/L[^']] 2 5"[ ;/£[(?•]] 

So we must calculate the R" and S" terms above.

i?"C;/Z;[F']] ^ ((F'iBCF])6?(pSW)[BCF]]+i?')[;/L[F']] 
^ ((pBW)[B[F]]-hfl')[;/i[F'CF'iB[F]]]] 
^ ((pBW)CS[F]]ti?')C;/i[S[F]] 
^ i?'C;/L[S[F]]] 

since L ELT ⍳ρB⍉W
implies L[B[F]] ELT ⍳(ρB⍉W)[B[F]]

Similarly, 5"[;/L[G']] ^ S'[;/L[B[G]]J 

Thus, ((^',G')W °.0 F)[;/L] ^ i?'C ;/ICSCF]]] 2 5'C ;/LCBC(;]]] 

Finally, observe that the result is in GDF since U and V are in SF and F' and
G' are in order and contain no duplications by construction. QED.
Proof of TR18: 

Immediate from T6. 
APPENDIX C
IDENTITY ELEMENTS
CHAPTER III
STEPS TOWARD A MACHINE DESIGN

Never do today what you can 
Put off till tomorrow. 

William Brighty Rands 

procrastination is the 
art of keeping 
up with yesterday 

Don Marquis, archy and mehitabel 

As demonstrated in Chapter II, there is a high degree of power and internal
consistency in the APL operators and data structures. This makes it possible to
write simple expressions which have the same semantic content as several
statements in comparable programming languages. This chapter discusses how to
exploit these features in the design of an APL machine.

In general, APL programs contain less detail than corresponding programs
in languages like ALGOL 60, FORTRAN, or PL/I. For instance, the maximum
value in a vector, V, of data can be expressed as ⌈/V in APL while ALGOL requires
the following:

     MAX := smallestnumberinmachine;
     for I := 1 step 1 until N do
          if V[I] > MAX then MAX := V[I];
While this aspect of APL often makes programs shorter and less intricate than, 
say, ALGOL programs, it also requires that an evaluator of APL be more complex 
than one for ALGOL, especially if such expressions are to be evaluated efficiently. 
On the other hand, a machine doing APL has greater freedom since its behavior is 
specified less explicitly. In effect, APL programs can be considered as descriptions 
of their results rather than as recipes for obtaining them. Further, the language
renders many of these descriptions obvious, both to the human reader and to a
machine, as in the case of ⌈/V, while other languages encode them so intricately
that the original intention of the programmer is hidden. In the example above,
an APL machine can choose any method it pleases to find the maximum value
while an ALGOL machine does not know what result is expected.

This feature of APL also has some drawbacks in that some expressions for
results require unnecessary computations if calculated literally as written. For
instance, the expression 3↑(2×-V) specifies a result which is the first 3 elements
of twice the negative of V. Presumably the programmer is only interested in these
three elements. However, the literal interpretation of this expression proceeds
as follows:

     1. Negate V (and store it somewhere).

     2. Multiply the previous result by 2 (and store it).

     3. Take the first 3 elements of the last result.
In case V is large, this process is grossly inefficient. The negation requires (ρV)
fetches and stores as well as (ρV) spaces for the value to be stored. The
multiplication requires another (ρV) fetches, stores, and multiplies. In fact, the
desired result could have been found simply by negating the first three elements
of V and multiplying by 2. Clearly, we would like the APL machine to be able to
evaluate such programs efficiently.
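
The difference between the two evaluation orders can be made concrete with a small sketch. It is written in Python rather than APL purely for illustration; the function names are hypothetical, and it assumes the take count does not exceed the length of V (APL's take would otherwise pad the result).

```python
def literal(v, k):
    """Evaluate k↑(2×-V) literally, in the order written."""
    negated = [-x for x in v]            # len(v) fetches and len(v) stores
    doubled = [2 * x for x in negated]   # len(v) more fetches, multiplies, stores
    return doubled[:k]                   # only now are all but k elements dropped


def selected_first(v, k):
    """Apply the take first, then negate and double only what is needed."""
    return [2 * -x for x in v[:k]]       # k fetches, k negations, k multiplies


V = list(range(1, 1001))
assert literal(V, 3) == selected_first(V, 3) == [-2, -4, -6]
```

Both functions return the same three values, but the first touches every element of V twice and allocates two full-sized temporaries, while the second does work proportional to the size of the result.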

A. Drag-Along and Beating

One approach to efficient and natural evaluation of APL expressions is to
exploit the mathematical properties of the language to simplify calculations. In
the machine, this approach is embodied in two fundamental new processes:
drag-along and beating.
Drag-along is the process of deferring evaluation of operands and operators
as long as possible. By examining a deferred expression it may be possible to
simplify it in ways which are impossible when only small parts of the expression
are available. In effect, drag-along makes the machine context-sensitive, while
most machines are context-free.

Consider the drag-along evaluation of the example in the last section. If we
assume a stack machine, the machine code for this expression might be

     1. LOAD V

     2. NEGATE

     3. LOAD 2

     4. MULTIPLY

     5. TAKE 3

The immediate execution of this sequence was already shown. Suppose now that
we temporarily defer instructions in a buffer instead of executing them as they
appear. After the first instruction, the buffer contains

     LOAD V
After instruction 2, we have

     LOAD V

     NEGATE
where the pointer connects the negation with its deferred operand, V. After
instruction 4, the buffer contains

     LOAD V

     NEGATE

     LOAD 2

     MULTIPLY

The evaluation of the TAKE is different from the previous operators since it is a
selection operator. TAKE can examine the contents of the buffer and change them,
as below. Note that the deferred expression is equivalent to the original expression.
The process of making the changes in the buffer is called beating.

     LOAD 3↑V      (Note change in this instruction)

     NEGATE

     LOAD 2

     MULTIPLY

When values must finally be computed, only the desired elements will be accessed
and used. Thus, drag-along facilitates beating.
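
The interaction of the two processes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the machine's actual buffer organization: the class names `Load`, `Const`, and `Apply` are hypothetical, the deferred expression is held as a small tree rather than a linear buffer, and the take count is assumed not to exceed the vector length.

```python
import operator


class Load:
    """Deferred reference to a stored vector; `length` records a pending take."""
    def __init__(self, data, length=None):
        self.data = data
        self.length = len(data) if length is None else length

    def take(self, k):
        # Beating: the take is absorbed into the access descriptor; no data moves.
        return Load(self.data, min(k, self.length))

    def element(self, i):
        return self.data[i]


class Const:
    """Deferred scalar, extended to whatever length its partner has."""
    length = None

    def __init__(self, value):
        self.value = value

    def take(self, k):
        return self

    def element(self, i):
        return self.value


class Apply:
    """Deferred element-by-element operator (monadic or dyadic)."""
    def __init__(self, fn, *args):
        self.fn, self.args = fn, args
        self.length = next((a.length for a in args if a.length is not None), None)

    def take(self, k):
        # A take on the result is pushed down onto every operand.
        return Apply(self.fn, *(a.take(k) for a in self.args))

    def element(self, i):
        return self.fn(*(a.element(i) for a in self.args))


def evaluate(expr):
    """Compute values only when finally demanded, one element at a time."""
    return [expr.element(i) for i in range(expr.length)]


V = [5, 1, 4, 2, 8, 7]
deferred = Apply(operator.mul, Const(2), Apply(operator.neg, Load(V)))  # 2×-V
beaten = deferred.take(3)                                              # 3↑(2×-V)
assert evaluate(beaten) == [-10, -2, -8]
```

Nothing is computed until `evaluate` is called, and by then the take has already been folded into the `Load`, so only the first three elements of V are ever fetched.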

The other aspect of drag-along is that it eliminates intermediate array-shaped
results with consequent savings of stores, fetches, and space. In an expression
such as A+B+C+D the literal execution proceeds in three steps:

     T1←C+D
     T2←B+T1
     T3←A+T2
If the variables A,B,C,D are vectors, each step above requires a vector-sized
temporary store and the last two steps require fetches to get the previous results
as operands. With drag-along, the entire expression is deferred, finally to be
evaluated element-by-element as:

     for I←1 step 1 until ρA do
          T3[I]←A[I]+B[I]+C[I]+D[I]
This requires no extra fetches, stores, or temporary space to obtain the desired
result.
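
The same contrast can be written out as a runnable sketch. It is in Python for illustration only, and the function names are hypothetical.

```python
def literal_sum(a, b, c, d):
    """A+B+C+D evaluated literally, right to left, with array temporaries."""
    t1 = [x + y for x, y in zip(c, d)]    # vector-sized temporary
    t2 = [x + y for x, y in zip(b, t1)]   # second temporary, refetches t1
    t3 = [x + y for x, y in zip(a, t2)]   # result, refetches t2
    return t3


def dragged_along_sum(a, b, c, d):
    """The deferred expression evaluated element by element, no temporaries."""
    return [a[i] + (b[i] + (c[i] + d[i])) for i in range(len(a))]


A, B, C, D = [1, 2, 3], [10, 20, 30], [100, 200, 300], [1000, 2000, 3000]
assert literal_sum(A, B, C, D) == dragged_along_sum(A, B, C, D) == [1111, 2222, 3333]
```

The element-by-element version performs the same additions in the same order but stores only the final result.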

In the machine, drag-along will be applied to all array operands ℰ and ℱ and
to all monadic and dyadic operators MOP and DOP for which

     (MOP ℰ)[;/L] ↔ MOP1 (F1 ℰ)[;/L]
and

     (ℰ DOP ℱ)[;/L] ↔ (F1 ℰ)[;/L] DOP1 (F2 ℱ)[;/L]
where F1 and F2 are simple functions of arrays and MOP1 and DOP1 are similar to
MOP and DOP. An example of a function which is not dragged along by the machine
is grade-up, which is essentially a sort of its operand. Grade-up obviously does
not fit into the above scheme since F1 also becomes a sorting function, which is
not simple as required.

B. Beating and Array Representation 

Beating is the machine equivalent of calculating standard forms of select
expressions. If the effort to do beating followed by an evaluation of a standard form
is less than that to evaluate an expression directly, then the process is worthwhile.
We will see in the following chapters that this is in fact the case.

In order to apply beating we must specify a representation of the standard
form. One possibility is to maintain the A, F, and G values for each array in an
expression to allow calculation of the standard form
     A⍉FΔGΔ̲M
as defined in Chapter II. However, these arrays contain redundant information
and it is desirable to find a more compact representation.

If we choose to represent arrays in row-major order we can utilize the
representation of the storage access function as the representation of standard forms.
In this way, beating will consist of applying the transformations of Chapter II to
the mapping functions for arrays.

In the following discussion we can assume without loss of generality that the
index origin is zero. Situations where it is different reduce to the zero case by
subtracting IORG from all subscripts. Let A be a rank-N array. Then, assuming
that each element in A is to occupy one word in memory, the element A[;/L] will be
located at

     VBASE+(ρA)⊥L                    (*)
where VBASE is the address of A[0;0;...;0]. Thus, subscripts of arrays stored
in row-major order are representations of numbers in a mixed-radix number
system (Knuth [1968] p. 297). This representation is especially suitable for arrays
in APL because APL arrays are rectangular, dense, and homogeneous. Further,
this representation does not favor any array coordinate over another, which is
essential in APL.

We can generalize the access function slightly by writing it in the form:

      VBASE+ABASE++/DEL×L                           (**)

where ABASE is an additive constant, in this case zero, and DEL is the weighting
vector used to calculate the base value in (*) above. DEL is computed by

      DEL[N-1]←1

      DEL[I]←DEL[I+1]×(ρA)[I+1] for each I∈ιN-1.

Example: Let M be a matrix with dimension 2,3. Then DEL←3,1 and we set ABASE←0.
The layout of M in memory is

      VBASE     +1        +2        +3        +4        +5
      M[0;0]    M[0;1]    M[0;2]    M[1;0]    M[1;1]    M[1;2]
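
The computation of DEL and of an element address can be sketched in Python. This is only an illustration under the assumptions of the text (index origin zero, one word per element); the function names weights and address are hypothetical and do not appear in the machine design.

```python
def weights(shape):
    """Row-major weighting vector DEL for a dimension vector (origin zero)."""
    dele = [1] * len(shape)
    # DEL[I] = DEL[I+1] * shape[I+1], filled in from the last coordinate backwards.
    for i in range(len(shape) - 2, -1, -1):
        dele[i] = dele[i + 1] * shape[i + 1]
    return dele


def address(vbase, abase, dele, subscripts):
    """Storage access function (**): VBASE + ABASE + sum of DEL times L."""
    return vbase + abase + sum(d * s for d, s in zip(dele, subscripts))


# The 2-by-3 matrix M of the example, with VBASE taken as 0 for illustration.
DEL = weights((2, 3))
layout = [address(0, 0, DEL, (i, j)) for i in range(2) for j in range(3)]
```

Here layout lists the offsets of M[0;0] through M[1;2] in the order of the table above.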



Given this formulation of the storage access function, it is only necessary to
transform ABASE and DEL in order to obtain the effect of evaluating selection
operations on an array.

Example: If M is the matrix in the previous example, then the mapping function
for (2,1)⍉M has the same VBASE. For the transpose we use ABASE'←0 and DEL'←1,3.
Note that the change in DEL corresponds to permuting it by 2,1. This new function
uses the same values that were stored for M, but accesses them as if they were
the transpose (2,1)⍉M. To verify this, note that the address for ((2,1)⍉M)[I;J]
is

      VBASE+ABASE'++/DEL'×I,J ↔ VBASE+ABASE'+(1×I)+(3×J)
                              ↔ VBASE+ABASE+(3×J)+(1×I)
                              ↔ VBASE+ABASE++/DEL×J,I

which is the location of M[J;I] ↔ ((2,1)⍉M)[I;J].

This can be done for any selection operator by using transformations analogous
to those in Chapter II. Appendix A shows the beating transformations on access
functions for arrays. In the machine, beating is also applied to expressions
containing reductions, scalar operators, and inner and outer products, based on the
results in Chapter II.
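
As an illustration of beating on access functions, the following Python sketch applies take, drop, reversal, and transpose to a descriptor without touching the stored values. It is a sketch under stated assumptions, not the machine's implementation: the class Access and the function names are hypothetical, index origin zero is assumed, operands are assumed conformable, and take is assumed never to request more elements than exist.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Access:
    rvec: tuple      # RVEC, the dimension vector
    dele: tuple      # DEL, coefficients of the access polynomial
    abase: int = 0   # ABASE
    vbase: int = 0   # VBASE

    def addr(self, subs):
        return self.vbase + self.abase + sum(d * s for d, s in zip(self.dele, subs))


def take(q, a):
    # A negative count keeps the last |q| elements, so the start moves forward.
    shift = sum(d * (r - abs(k)) for d, r, k in zip(a.dele, a.rvec, q) if k < 0)
    return replace(a, abase=a.abase + shift, rvec=tuple(abs(k) for k in q))


def drop(q, a):
    # A positive count skips the first q elements along that coordinate.
    shift = sum(d * k for d, k in zip(a.dele, q) if k > 0)
    rvec = tuple(r - abs(k) for r, k in zip(a.rvec, q))
    return replace(a, abase=a.abase + shift, rvec=rvec)


def reverse(j, a):
    # Start at the last element of coordinate j and step backwards.
    dele = list(a.dele)
    abase = a.abase + dele[j] * (a.rvec[j] - 1)
    dele[j] = -dele[j]
    return replace(a, abase=abase, dele=tuple(dele))


def transpose(perm, a):
    # Generalized transpose: coordinates mapped to the same result axis
    # form a diagonal (minimum length, summed coefficients).
    rank = 1 + max(perm)
    rvec = tuple(min(r for r, p in zip(a.rvec, perm) if p == i) for i in range(rank))
    dele = tuple(sum(d for d, p in zip(a.dele, perm) if p == i) for i in range(rank))
    return replace(a, rvec=rvec, dele=dele)


def view(a, values):
    """Read the elements selected by descriptor a, in row-major order."""
    return [values[a.addr(s)] for s in product(*map(range, a.rvec))]


values = list(range(6))                  # value array of a 2-by-3 matrix
m = Access(rvec=(2, 3), dele=(3, 1))
```

Each function returns a new descriptor over the same value array, which is the point of beating: view(transpose((1, 0), m), values) reads the transpose without moving any element.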

C. Summary 

At this point we have outlined the framework of a machine for APL. It is 
pleasing to know that it will work since it is justified by theoretical results 
developed earlier. The remainder of this dissertation discusses the structural 
details of a machine based on the beating and drag-along processes and gives an 
evaluation of its effectiveness. Let us outline some goals that such a design should 
satisfy: 

1. The machine language should be close to APL. That is, it should contain
all primitives in the language and in a similar form. While it is well-known how
to design a machine to accept APL directly there is no particular advantage to
doing so. We are primarily concerned with processing the semantics of the
language, not its syntax. Thus there is no loss of generality in letting the machine
language be a Polish string version of APL. This has the further advantage of
freeing the machine from the particular external syntax of APL.

2. The machine should be general and flexible. In particular, it should
not be so deeply committed to evaluating APL as to be useless for other purposes.

3. The machine should do as much as possible automatically. This includes
storage management, control, and simplification of expressions. The programmer
should not have to be aware of the structure and internal functioning of the machine
at a level much beyond that specified in an APL program.

4. The machine should do simple things simply and complex tasks in pro- 
portion to their complexity. In other words, the work required for the machine 
to execute a program or expression should be related in some straightforward 
way to the program's complexity. 

5. The machine should be efficient. This is perhaps the most important 
focus of this work. Of course, the question of efficiency is related to the current 
technology; at present, a major bottleneck in evaluating array-valued expressions 
is use of memory. Thus we concentrate on reducing memory accessing and tem- 
porary storage space in the evaluation of APL programs. 

6. The machine design should be elegant, clean, and perspicuous.

APPENDIX A
TRANSFORMATIONS ON STORAGE ACCESS FUNCTIONS INDUCED BY
SELECTION OPERATORS

1. The storage access function for an array M contains the following information:

      RANK  ↔ ρρM
      RVEC  ↔ ρM
      VBASE   location of first element of ,M
      ABASE   constant term of access polynomial
      DEL     vector of coefficients of access polynomial

Then, the element M[;/L] is located at

      VBASE+ABASE++/DEL×L

2. This section lists the transformations on storage access functions which are
used to effect beating of selection operators. These transformations are given
as program segments written in index origin zero. It is assumed that the parameters
to the various selection operators are conformable and in the proper domain.

a. Q↑M

      ABASE ← ABASE+DEL+.×(Q<0)×RVEC-|Q
      RVEC ← |Q

b. Q↓M

      ABASE ← ABASE+DEL+.×(Q>0)×|Q
      RVEC ← RVEC-|Q

c. ⌽[J]M

      ABASE ← ABASE+DEL[J]×(RVEC[J]-1)
      DEL[J] ← -DEL[J]

d. A⍉M

      R ← RVEC
      D ← DEL
      RANK ← 1+⌈/A
      I ← 0
      DEL ← RANK↑DEL
      RVEC ← RANK↑RVEC
      RANK REPEAT
      BEGIN
         RVEC[I] ← ⌊/(I=A)/R
         DEL[I] ← +/(I=A)/D
         I ← I+1
      END



e. M[[J] SCALAR]

      ABASE ← ABASE+DEL[J]×SCALAR
      DEL ← (J≠ιRANK)/DEL
      RVEC ← (J≠ιRANK)/RVEC
      RANK ← RANK-1

f. M[[K] J LEN,ORG,S]

      ABASE ← ABASE+DEL[K]×ORG+S×(LEN-1)
      RVEC[K] ← LEN
      IF S=1 THEN DEL[K] ← -DEL[K]

CHAPTER IV
THE MACHINE 

This chapter contains a functional description of a machine designed to process 
the semantic content of APL programs. 

In general, the description will be given in English, although algorithmic
descriptions will be used as necessary to provide clarifications. The section will
be written in the style of a programming manual, with the addition of explanations
and rationales as required.

The APL machine (APLM) is conceptually composed of two separate machines,
each with its own language, sharing the same registers and data structures. The
D-machine (DM) accepts APL-like machine code and does all the necessary analysis
on expressions. The DM produces code for the E-machine (EM), and in the process
does some simplification of incoming expressions using drag-along and beating.
The E-machine does all the actual computations of values in the system. By using
a stacking location counter based on the organization of machine code into segments,
the overall control scheme for the machine is quite simple.

The current chapter consists of five sections which present the APLM in a
logical sequence. Section A discusses the data structures and other manipulable
objects in the machine, and explains how they are managed in the machine's
memory. Section B continues by explaining the stacks and other registers in the
machine, followed by a discussion of the overall machine control in Section C.
Finally, the details of the D-machine and the E-machine are set forth in Sections
D and E, respectively. Examples are used liberally throughout, to clarify
operational details of the APL machine.

A. Data Structures and Other Objects

The manipulable objects in the machine fall into three main classes: data
values, descriptors, and program segments. This section will describe these
three kinds of objects and how they are represented in the machine.

Scalars are the simplest kind of data. In APL, a scalar is an array of
rank 0; in practice, a scalar is a different kind of object than an array, and is
so treated in the machine. Although arrays are stored in the memory, M, of the
machine, scalars are not. They appear only in the machine registers, in particular
the value stack, and as immediate operands in a code string. In a real machine,
scalars would have an attribute of type, determining the kind of representation to
use for encoding and decoding them. In this work, we will assume that this is
handled automatically, and that all scalar data are the size of a single machine
word.

The most important data structure in the APLM is the array. The
representation of an array is divided into two parts. The first is the value array, which is
a row-major order linearization of the elements of the array. The second part
is a descriptor array (DA) for an array, which contains the rank, dimension, and
storage mapping function for the array. This separation makes it possible to have
multiple DA's, not necessarily identical, referring to the same value array, which
makes beating possible. In this chapter, descriptor arrays will be shown in the
form:

      @ARR   RC=2       LEN=05
      +01    VB=VARR    AB=000
      +02    RANK=2
      +03    R(1)=003   D(1)=02
      +04    R(2)=002   D(2)=01

@ARR is the address in memory of the first word of the descriptor array for the
array named ARR, which is shown above. The first word contains a reference
count (RC) and a length (LEN) field, as explained in the discussion on memory
in the APLM. The rank of the array is recorded in the third word of the DA;
words after that contain the elements of the dimension vector, labelled R(I). Thus
in this case, ρARR is 3,2. The second word in the DA encodes the base address
of the value part of the array (labelled VB for VBASE) and the constant term in
the storage mapping function (here labelled AB for ABASE). Finally, the DA
contains the coefficients of the storage mapping polynomial, DEL (labelled D(I)
here). Recall that for an array ARR, the element ARR[;/L] is located at

      VBASE + ABASE + +/DEL × (L - IORG)

This formula is the storage mapping function for any array.

In addition to array descriptors, the machine contains descriptors for
J-vectors. Recall from Chapter II that a J-vector is a vector of consecutive
integers which can be specified by a length, an origin, and a direction bit. We
assume that these three quantities can be encoded into a descriptor by the
function JCODE(length, origin, direction) and that there are appropriate decoding
functions. (See Appendix A.)
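
A J-vector descriptor can be pictured with a small Python sketch. The text leaves the encoding unspecified, so the record layout below is purely hypothetical; the convention that a direction bit of 1 means descending order is likewise an assumption made for illustration.

```python
from typing import NamedTuple


class JVector(NamedTuple):
    length: int
    origin: int
    direction: int   # assumed convention: 0 = ascending, 1 = descending


def jcode(length, origin, direction):
    """Hypothetical stand-in for JCODE: bundle the three quantities."""
    return JVector(length, origin, direction)


def expand(j):
    """Decode a J-vector descriptor into the integers it denotes."""
    ascending = list(range(j.origin, j.origin + j.length))
    return ascending[::-1] if j.direction else ascending
```

The descriptor occupies constant space regardless of length, which is why such vectors can be kept in registers and in NT rather than in M.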

Finally, programs in the machine are represented internally as program
segments. A program segment is any sequence of machine commands and operands,
and is referenced by a segment descriptor. Segment descriptors contain an
encoding of the beginning address of a segment (relative to the beginning of the
function they are a part of) and the length of the segment. There is also a bit
which indicates the execution mode for the segment (see Section C).

Each defined function (program) is a segment, and logical subparts of the
function may also be represented as segments. As will be seen later, it is easy
to activate and de-activate segments in the APL machine. Briefly, the advantage
of organizing programs in segments is that these are the logical units of a program,
while other organizations, such as paging, do not allow this kind of natural
correspondence of form and function (pardon the pun!). An important property of
APLM instructions is that they contain no absolute addresses except for references
to NT, which remain constant in any compilation. All internal references to
other parts of a program are relative. Thus, all programs are relocatable.

Each function has a corresponding function descriptor, which is similar to
a DA. A function descriptor contains the following information:

      FVBASE   location in M of beginning of function segment
      FLEN     length of function segment
      FIORG    index origin for this function
      FISR     logical variable, 1 if function has a result
      FPARS    number of parameters
      FLCL     total number of local names

In addition, the rest of the function descriptor contains a list of all local names
in the function, in the order: result (if any), parameters (if any), local variables
(if any). The function descriptor for a function is used in calling and returning
from functions, as will be discussed in Section D.

Main memory in the machine is a linear array of words named M. The only
objects which reside in M are arrays, DA's, and program segments. All other
objects are stored in the machine's registers. In addition to M, there is an array
NT, the Name table, which is an abbreviated symbol table. Every identifier in the
active workspace has an entry in NT, which contains descriptive information and
either an actual value or a pointer to where it can be found in M. Scalars and
J-vector descriptors are stored directly in NT. Thus, all references to variables
and functions in the machine go through the NT. This organization allows for
dynamic allocation and relocation of space in M, without having to alter any
program references. The operation of NT is described more fully in the next
section under machine registers. Constant array values within a function are
stored as part of the program segment; they are addressed relative to the beginning
of the function, and so, too, remain relocatable.

Within M, two different allocation mechanisms are used, one for functions
and array values, and one for descriptor arrays. The reasons for this are that,
because of drag-along and beating, DA's are expected to have a shorter lifetime
than functions or array values. Further, in a given function, locally at least, it
is likely that DA's will be of similar sizes. Thus, it is feasible to keep an
available space list for DA's, with the hope that erased spaces can be reused
intact. We would therefore expect more efficient use of M by DA's than by array
values.

The free memory space (M) is arranged as follows: functions and array
values are allocated from the lowest address (BOTM) towards the top of M and
DA's are allocated from the top (TOPM) down. The space in the middle is the POOL,
with boundaries BOTP and TOPP. Each entry in M has a header word containing
an encoding of a reference count (see Collins [1965]), the length of the entry, and
a filler count. The latter field is used when space slightly larger than necessary
is allocated. Each time a reference to an entry is added or deleted, the reference
count field is adjusted. When a reference count goes to zero, meaning that there
are no uses of the entry anywhere in the system, the entry is made available in
one of two ways. If it is adjacent to the POOL, it is merged with POOL. Otherwise,
it is added to the appropriate availability list, of which there are two, one
for DA's and one for functions and array values.

The availability lists are doubly linked, and each entry contains a header
similar to those for active entries. When space is needed, the appropriate
availability list is searched using the first-fit method (Knuth [1968], p. 436 ff.). If
a fit is found, the space is allocated and the availability list adjusted. Otherwise,
space is taken from the POOL. If a request for M-space is made which cannot
be honored because there is not enough contiguous space available, a garbage
collection is made. The two halves of M are garbage-collected separately. In
collecting array space, all the DA's are scanned and a linked list is set up which
ties together all DA's pointing to the same entry. Then arrays are compacted
towards BOTM, with the links used to adjust the VBASE fields in the referent DA's.
If enough space is still not available, the DA's are also compacted, using a
similar algorithm. Some coalescing of available space is also done by the
allocation algorithm, GETSPACE. Figure 1 illustrates how M is structured.
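
The allocation policy for the low end of M (functions and array values) can be sketched as follows. This is a deliberately simplified illustration and not GETSPACE itself: header words, filler counts, the doubly linked list, coalescing, and garbage collection are all omitted, and the class and method names are hypothetical.

```python
class ArraySpace:
    """Low end of M: entries grow upward from BOTM into the POOL."""

    def __init__(self, botm, topp):
        self.botp = botm      # lower boundary of the POOL (nothing allocated yet)
        self.topp = topp      # upper boundary of the POOL
        self.avail = []       # availability list of (address, length) pairs

    def get(self, size):
        # First fit: take the first available block that is large enough.
        for i, (addr, length) in enumerate(self.avail):
            if length >= size:
                if length == size:
                    del self.avail[i]
                else:
                    self.avail[i] = (addr + size, length - size)
                return addr
        # No fit on the availability list: take space from the POOL.
        if self.botp + size > self.topp:
            raise MemoryError("not enough contiguous space; collect garbage")
        addr, self.botp = self.botp, self.botp + size
        return addr

    def release(self, addr, size):
        # Called when an entry's reference count reaches zero.
        if addr + size == self.botp:
            self.botp = addr                  # adjacent to POOL: merge with it
        else:
            self.avail.append((addr, size))   # otherwise keep it on the list
```

The DA side of M would mirror this sketch, growing downward from TOPM with its own availability list.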

B. Machine Registers 

This section describes the registers and register-like structures in the APL 
machine. The present description covers only the logical functions performed by 
these registers and does not make any demands on how they are actually to be 
implemented. Although most of the registers are not directly accessible to the 
programmer, thorough knowledge of their use is important to understanding the 
functioning of the machine. 

There are several registers related to memory accessing and allocation.
The most important of these is the Nametable, NT. NT is an associatively
addressed stack, each entry of which contains a name field, a tag, and a value.
The name field of an entry contains an index for the identifier associated with the
entry. Permissible tags in NT are ST, for scalar quantities, JT, for encoded
J-vectors, UT, for undefined identifiers, DT, for arrays, and FT for functions.
ST and JT entries contain the actual value in their value field, while DT and FT
entries have descriptor addresses in their value fields.

When a function is called, an entry is pushed to NT for each of the function's
local variables and parameters, as listed in the function descriptor. Similarly,
when a function is de-activated, the reverse process occurs. Each time a variable
is accessed, NT is searched associatively from the top (latest entry). If a hit is
not found, then the desired variable must be global, and it is entered into NT.
This mode of maintaining the NT makes identifier behavior correspond to APL's
"dynamic block structure" and facilitates recursive function calls.
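
The search discipline of NT can be sketched in a few lines of Python. The class and method names are hypothetical, and one detail is an assumption not settled by the text: on a miss, the sketch enters the global name at the bottom of the stack so that it survives later function returns.

```python
class NameTable:
    """Sketch of NT: a stack of entries searched from the top."""

    def __init__(self):
        self.entries = []    # bottom ... top

    def call(self, local_names):
        # Push one undefined (UT) entry per local name of the called function.
        for name in local_names:
            self.entries.append({"name": name, "tag": "UT", "value": None})

    def leave(self, local_names):
        # De-activation: remove the entries pushed by the matching call.
        del self.entries[len(self.entries) - len(local_names):]

    def lookup(self, name):
        for entry in reversed(self.entries):    # latest entry first
            if entry["name"] == name:
                return entry
        # Miss: the name must be global. Placing it at the bottom is an assumption.
        entry = {"name": name, "tag": "UT", "value": None}
        self.entries.insert(0, entry)
        return entry
```

Because the most recent entry for a name always wins, a local X shadows a global X for the duration of a call, including any functions called in turn.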

The most important registers in the APL machine are four stacks. The use 
of stacks permits elimination of addresses from most instructions and simplifies 
the evaluation of recursive and nested programs. 

1. Value Stack (VS) 

VS is the main stack in the machine and is used in the evaluation of expressions
and in function calls. Each VS entry consists of a tag and a value part, as in NT
entries. In addition to scalars and function or DA pointers, VS can contain segment
descriptors, partially-evaluated addresses, function marks, and names.

2. Location Counter Stack (LS) 

Recall that machine code is organized into segments, characterized by a
starting address and a length. Each LS entry contains the starting address of a
segment (ORG), its length (LEN), a relative count, pointing to the next instruction
to be executed (REL), and control information. Each time a segment is activated,
its beginning address and length are pushed to LS, and the REL field is set to zero.
The address of the next instruction is then determined from the REL and ORG fields
on the top of LS. After each instruction fetch, the REL field at the top of LS is
incremented. When this value is equal to the length of the segment, the segment
is terminated by popping the top of LS, thereby reactivating the next entry. The
control information in LS is used to coordinate it with the other stacks in the machine.

3. Iteration Control Stack (IS)

Array-valued APL expressions implicitly specify an index set for the
expressions. In this machine, IS is used to control (nested) iterations over this index
set in the element-by-element evaluation of array-valued expressions. The
operation of IS is coupled with LS as follows: when a set of iterations is begun,
the limits of the iteration are pushed into the iteration stack, and a segment is
activated containing the range of the iterations. Then, for each instruction in
the code segment, the necessary index values are taken from IS. When the segment
is completed, the entries in IS are stepped and if the required iterations are not
exhausted, the segment is re-initialized and repeated with the new IS values.
Eventually, the iterations are completed and the segment in the range also is
completed, in which case IS and LS are both popped, returning the machine to the
place it was to resume after the iterated code was completed. (See Section D.)

The IS behaves essentially like a nest of FORTRAN DO's. Each entry contains
a counter (CTR) (to origin zero), the maximum value of the counter (MAX), a
direction bit (i.e., count up or down) (DIR), and control information. Although
the IS is partially accessible to the machine code, it is for the most part
maintained automatically. Like LS, IS could probably be incorporated into the value
stack, since these three stacks generally work in parallel. However, by separating
these stacks by their functions, the machine design becomes cleaner and more
perspicuous.
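
The coupling of LS and IS described above can be sketched as a small Python loop. This is an illustration only: the function names are hypothetical, a single segment is run under a single group of IS entries, and the way a downward count is derived from CTR (as MAX minus CTR) is an assumption.

```python
def step(is_entries):
    """Advance the innermost counter, carrying outward; False when exhausted."""
    for entry in reversed(is_entries):
        if entry["CTR"] < entry["MAX"]:
            entry["CTR"] += 1
            return True
        entry["CTR"] = 0
    return False


def index_value(entry):
    # Assumed reading of DIR: 1 counts down from MAX, 0 counts up from zero.
    return entry["MAX"] - entry["CTR"] if entry["DIR"] else entry["CTR"]


def run(segment, limits):
    """segment is (ORG, LEN); limits is a list of (MAX, DIR), outermost first."""
    is_entries = [{"CTR": 0, "MAX": m, "DIR": d} for m, d in limits]
    ls = [{"ORG": segment[0], "LEN": segment[1], "REL": 0}]
    trace = []
    while ls:
        top = ls[-1]
        if top["REL"] == top["LEN"]:       # segment completed
            if step(is_entries):
                top["REL"] = 0             # re-initialize and repeat
            else:
                ls.pop()                   # iterations exhausted: pop LS (and IS)
            continue
        address = top["ORG"] + top["REL"]  # next instruction from ORG and REL
        top["REL"] += 1
        trace.append((address, tuple(index_value(e) for e in is_entries)))
    return trace
```

The returned trace pairs each fetched instruction address with the index values in force, showing that the segment is replayed once per element of the index set.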

4. Instruction Buffer (QS) 

Unlike LS and IS, the instruction buffer QS is logically separate from the
value stack. QS is not strictly a stack, since it is possible to access and alter
information at places other than its top. In the D-machine, instructions are
fetched from M, some of which are executed immediately, and others of which
are either evaluated by beating or are deferred in QS by drag-along. In entering
instructions in QS, the DM may change other related QS entries. When the
E-machine is activated, instructions are fetched from QS and executed directly,
generally in conjunction with VS and IS. QS contains operation and value fields,
similar to VS, a LINK field used to reference other deferred instructions, and
an AUX field, which is a logical vector acting as an access mask for array entries
(see Section E).

A final four registers in the machine are mentioned primarily for completeness. 
These are: 

      IORG    Index origin of current active function
      FBASE   Base address in M of current active function
      FREG    VS index of function mark for current active function
      ISMK    IS index of topmost IS entry containing 1 in its MARK field.

The use of these registers is shown in the examples in following sections. 

C. Machine Control 

The purpose of the APL machine is to transform a set of data (the input) into 
a second set (the output) according to encoded transformation rules (the program) 
which are interpreted according to a predetermined scheme (the machine). This 
entire process is called the evaluation of the program and input. 

In the APL machine, programs are evaluated in two separate but related
submachines. The D-machine takes its instructions from main memory, M, in the
form of Polish APL code, and does all the necessary domain testing and storage
allocation for the various operands. In addition, the DM does simplification of
incoming expressions by drag-along and beating. The output of the D-machine is
values in VS and transformed code in the QS, in the form of instruction segments
for the E-machine. At critical points, determined either by the programmer or
the DM, control is passed to the E-machine, which executes the simplified
instructions in QS, producing values in VS and M. When done, the EM passes
control back to the DM, which resumes where it left off.

The division of labor between the two submachines is logically similar to that
between a compiler and its target machine. The DM plays the role of the algebraically
simplifying compiler, whose source language is essentially APL, and whose
target language is E-machine code. The E-machine as the target of the DM's
transformations is a conceptually simple computer which does nothing but compute
values. Given this scheme, a question which naturally arises is, Why bother with
the D-machine at all? Why not use a separate compiler in software and let it
produce code for a machine similar to our E-machine? Unfortunately, this is
impossible, since the behavior of the D-machine is dependent not only on the
source code (program), but is also dynamically dependent on the data. For instance,
consider a simple APL expression such as A + B. We would like the source code
for this expression to be something conceptually like

      LOAD B    (i.e., "load" B to the value stack)
      LOAD A
      ADD       (i.e., add the values on top of the value stack and leave the
                result there)

The problem here is that we would like the machine to do different things depending
on the data. In particular, if both A and B are scalars at the time the above code
is executed, it would be desirable to have the LOAD instructions push the actual
scalar values to the stack, and to have the ADD do the actual addition. But if A
and B are conformable arrays, the desired action is to defer the entire operation
(both LOADs and the ADD) in the instruction buffer, to be performed later by the
E-machine.

No compiler would be able to make these decisions a priori unless it knew
what data was to be used in running the program, or unless variables were
sufficiently restricted by declarations. Further, much of the work done by the
D-machine is domain testing, including rank and dimension checking, on
dynamically-specified variables. Since this process is data-dependent, it must be
performed dynamically.
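
The data dependence of the LOAD and ADD example can be made concrete with a toy Python interpreter. It is a sketch under strong assumptions: the instruction tuples, the DEFERRED marker, and the function d_machine are hypothetical, only the two cases named in the text are handled (both operands scalar, or both arrays), and conformability checking is omitted.

```python
DEFERRED = "deferred"   # stand-in for a VS entry that refers to code held in QS


def d_machine(code, env):
    """Run the same code string; act immediately on scalars, defer arrays."""
    vs, qs = [], []
    for op, *operand in code:
        if op == "LOAD":
            value = env[operand[0]]
            if isinstance(value, (int, float)):
                vs.append(value)                  # scalar: push the actual value
            else:
                qs.append(("LOAD", operand[0]))   # array: defer the load in QS
                vs.append(DEFERRED)
        elif op == "ADD":
            left, right = vs.pop(), vs.pop()
            if DEFERRED not in (left, right):
                vs.append(left + right)           # both scalars: add now
            elif left == right == DEFERRED:
                qs.append(("ADD",))               # both arrays: defer the add
                vs.append(DEFERRED)
            else:
                raise NotImplementedError("mixed scalar/array case omitted")
    return vs, qs


code = [("LOAD", "B"), ("LOAD", "A"), ("ADD",)]
```

Running the same code with scalar bindings leaves a value on VS and nothing in QS, while array bindings leave all three instructions in QS for the E-machine, which is the decision no static compiler could make.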

Both the D-machine and the E-machine share all the registers and the memory
of the entire APL machine. Further, both are controlled by a central cycle
routine, shown in Fig. 2. The key to the overall control of the APLM is the
location counter stack, LS, which contains active segments for both the DM and
the EM. In Fig. 2 we see that a major machine cycle takes the form:

a. Check to see if the current active segment has been completed. If not,
proceed to step b; otherwise see if this segment is under control of the
iteration stack. If it is, then step the iteration stack; in case IS does not
overflow, then reset the REL field to the beginning of the segment and
repeat this step. If the segment is not under control of IS or if it is and
the iteration stack overflowed, then de-activate the segment and repeat
this step.

b. Calculate the effective address of the current instruction and update the
location counter stack.

c. Select the appropriate machine, determined by the D/E bit in the current
active segment. If the DM is selected, then defer any arrays referenced
on the top of the value stack to the instruction buffer; also, fetch the
instruction and (if necessary) the second word of the instruction from
memory. Finally, decode and interpret the instruction and return to
step a.

FIGURE 2. Maincycle routine.

D. The D-Machine

The D-machine evaluates programs written in "machine language" by generating
instructions in QS to be executed later by the E-machine. As discussed in Chapter
III, the use of a Polish string for the machine language rather than "raw" APL frees
the APLM from the particular concrete syntax of APL without sacrificing any of the
semantic content.

Most of the instructions in the APLM correspond directly to the APL primitives; 
those which do not are the control instructions, which comprise a more powerful
set in the machine than are provided in the source language. All operands in DM 
instructions are either relative addresses within the program segment or are NT 
references or are immediate values. As a result, all programs in the machine 
are relocatable. Since only constant data is contained in function segments, 
programs are likewise re-entrant. 

The D-machine instruction set is listed in Tables 1-1, 1-2, and 1-3. The 
instructions are divided into three classes: storage management instructions, 
control instructions, and operator instructions. It is clear from Table 1 that no
systems functions are included in the D-machine's repertoire. In a real
implementation of an APL machine, these instructions would have to be provided,
although for the current work, they are irrelevant. The remainder of this section
discusses the instructions of the D-machine, with examples to clarify the details. 
0. A Guide to the Examples

The examples used in this chapter include program listings, register dumps,
and memory dumps. In showing program excerpts, we generally also show the
APL source expression, and give values, or at least attributes, for the operands.
Programs are shown in assembly language format, except that absolute addresses
are given. Although nothing has been said of the manner in which D-machine instructions
are encoded, we have chosen, for purposes of illustration, to show them as one- or
two-word quantities, depending on whether or not they have operands. All operand
addresses are shown symbolically and comments are used to explain the program
structure. In the register dumps, most of the material is self-explanatory. Field
headings are summarized in Appendix A. The top of each stack is indicated by an
arrow. Descriptor array addresses, which are pointers to the memory, are in the
form @A, for variable A, and value addresses in M are of the form VA. Again, in
the real machine, these would in fact be numerical addresses, but the symbolic
form is much clearer for examples. Fields in DA's are labelled mnemonically.
Segment descriptors in VS or QS are shown in the form SCODE(SEG.X, m), where
m is 0 or 1 depending on whether the segment is a DM or an EM segment, and X
is the segment symbolic name (arbitrary). EM segments are delimited by "brackets"
along the right side of the QS display, in the format XY, meaning that segment X
starts here and segment Y ends here. The LINK field of QS contains relative pointers
and is interpreted according to the opcode. The contents of the AUX field are to be
interpreted as a logical vector, although in fact it may be encoded differently in an
actual APLM.

TABLE 1-1
Storage Management and Control Instructions

Opcode   Operand        Description

A. Storage Management Instructions

LDS      scalar         Load scalar
LDSEG    seg-descr      Load segment descriptor
LDJ      jcode l,o,s    Load J-vector
LDIS     K              Load iteration stack counter, K from top of IS
LDCON    K              Load constant array, starting at FBASE+K
LDN      N              Load name N
LDNF     N              Load name N and fetch value
ASGN                    Assign (and discard value)
ASGNV                   Assign and leave value

B. Control Instructions

JMP      K              Jump by K (signed) in current segment
JMP0     K              Jump by K in current segment only if top
                        of VS is 0. Pop VS in either case
JMP1     K              Same as JMP0 except test for 1
LEAVE                   De-activate this segment
                        (i.e., pop LS and also IS if necessary)
RETURN                  Return from current function
ITM                     Iterate and mark
DO                      Call E-machine to work on top of VS
DOI                     Same as DO except that temporary space is
                        allocated for the result, if any, and the result
                        is left on top of VS

TABLE 1-2
Scalar Arithmetic Operators

Operator   APL   Definition

A. Dyadic

ADD        +     Add
SUB        -     Subtract
MUL        ×     Multiply
DIV        ÷     Divide
MOD        |     Modulus
MIN        ⌊     Minimum
MAX        ⌈     Maximum
PWR        *     Power
LOG        ⍟     Logarithm
CIR        ○     Circular functions
DEAL       ?     Random deal
COMB       !     Binomial coefficient or beta function
AND        ∧     Logical and
OR         ∨     Logical or
NAND       ⍲     Logical nand
NOR        ⍱     Logical nor
LT         <     Less than
LE         ≤     Less than or equal
EQ         =     Equal
GE         ≥     Greater than or equal
GT         >     Greater than
NE         ≠     Not equal

B. Monadic

PLUS       +     Plus
MINUS      -     Minus
SGN        ×     Signum
RECIP      ÷     Reciprocal
ABS        |     Absolute value
FLOOR      ⌊     Floor
CEIL       ⌈     Ceiling
EXP        *     Exponential (base e)
LOGE       ⍟     Logarithm (base e)
PI         ○     Pi times
RAND       ?     Random number
FAC        !     Factorial or gamma function
NOT        ~     Logical not

TABLE 1-3
Remaining Operators in D-Machine

Operator    APL      Definition

A. Selection

TAKE        ↑        Take
DROP        ↓        Drop
REV K       ⌽[K]     Reverse along K-th coordinate
TRANS       ⍉        Generalized transpose
INX K       [[K]]    Index on K-th coordinate

B. Evaluated Immediately

BASE        ⊥        Base value (Decode)
REP         ⊤        Representation (Encode)
GDU         ⍋        Grade up
GDD         ⍒        Grade down
CAT K       ,        Catenate (top K on VS)
RAV         ,        Ravel
URHO        ρ        Dimension
DRHO        ρ        Restructure
UIOTA       ι        Interval

C. Deferrable

ROT K       ⌽[K]     Rotate on K-th coordinate
EPS         ∈        Membership
DIOTA       ι        Rank
CMPRS K     /[K]     Compress on K-th coordinate
EXPND K     \[K]     Expand on K-th coordinate
SUBS K      [ ]      Subscript with K expressions

D. Compound

RED K OP    op/[K]   Reduce along K-th coordinate by OP
GDF OP               General dyadic form with OP

1. Storage Management Instructions

This class includes all instructions concerned primarily with the storing and
fetching of data. Each of the load instructions pushes a value to the value stack.
Of these, four have immediate operands: LDS, LDSEG, LDJ, and LDN push their
operands to VS with tags ST, SGT, JT, and NPT respectively. LDIS K loads as a
scalar the current value of the CNT field of the iteration stack element K entries
from the top of IS. LDNF N refers to variable N in the nametable, and enters the
current value of the variable (from NT) into VS. In the case of NT entries with tag
DT (i.e., arrays), the reference count of the DA is increased by 1 when it is
entered into VS, and the VS tag is set to FDT. The LDCON K instruction is used
to access a constant array stored in a function segment. Its operand K is a pointer
relative to the function origin pointing to the beginning of the DA for the constant
value. This DA is copied to the DA area of M, its VBASE is set to the beginning
of the function (FBASE), and its ABASE is set to K. The DA pointer is pushed to
VS with tag FDT.

Although all the load instructions just described push a value to VS, such
values do not always remain there. At the beginning of each D-machine cycle, the
top of VS is examined for tags FDT, DT, and JT (see Fig. 2). If one of these is
present, then the entry is deferred in QS, because it is array-valued. This is
done by pushing an E-machine instruction to QS of the form

        OP   @ARR   MASK

OP is IFA, IA, or IJ, depending on whether the VS tag was FDT, DT, or JT;
@ARR is the DA pointer that was in the VS value field, and MASK is an access
mask. The access mask in this case is a logical vector whose last K bits are 1
when ARR is a rank-K array. It will be used by the DM in beating and by the EM
in accessing this array. The LINK field in E-machine instructions of this type is
unused, and thus is shown as above. The VS entry is then replaced by a segment
descriptor with tag SGT pointing to the one-word QS segment containing the deferred
operand. In general, this entire process is invisible in the examples below, and
the load instructions which generate array values can be thought of as doing the
deferral themselves.
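
The deferral step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names QSEntry, access_mask, and defer_top are hypothetical, VS and QS are modelled as plain lists, and the 4-bit mask width is assumed from the AUX fields printed in the register dumps rather than specified by the text.

```python
from dataclasses import dataclass

# VS tag -> E-machine opcode of the deferred fetch, as stated in the text.
DEFER_OPCODE = {"FDT": "IFA", "DT": "IA", "JT": "IJ"}


@dataclass
class QSEntry:
    op: str        # E-machine opcode
    value: object  # DA pointer (or J-vector code) taken from VS
    aux: str       # access mask, written as a bit string


def access_mask(rank, width=4):
    """Logical vector whose last `rank` bits are 1 (width is an assumption)."""
    return "0" * (width - rank) + "1" * rank


def defer_top(vs, qs, rank_of):
    """If the top of VS is array-valued, defer it in QS.

    The VS entry is replaced by an SGT descriptor for the one-word
    QS segment that now holds the operand. Returns True if deferred.
    """
    tag, value = vs[-1]
    if tag not in DEFER_OPCODE:
        return False  # scalars, names, etc. stay on VS
    qs.append(QSEntry(DEFER_OPCODE[tag], value, access_mask(rank_of(value))))
    vs[-1] = ("SGT", ("SEG", len(qs) - 1, 1))  # m = 1: an EM segment
    return True
```

For a rank-2 array the mask comes out as 0011, which agrees with the AUX fields shown for the rank-2 operands of Example 3.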

Although ASGN and ASGNV are operators, they are included as storage
management instructions because they have the side-effect of causing values to
be stored. These instructions expect the top of VS to contain a destination, either
as a name (tag NPT) or as a QS descriptor pointing to a segment containing only
an IA instruction; the second entry in VS is the right-hand side of the assignment.
There are several possible actions taken by the DM in interpreting assignments,
depending on the VS contents. These cases are explained in Table 2. We have
assumed that "evil" side effects do not appear in the code; their treatment is
straightforward, but uninteresting. Also, it should be noted that although the
strategies outlined in Table 2 could be modified to alter the machine's performance,
the case analysis remains the same.

TABLE 2

Interpretation of ASGN and ASGNV in the D-Machine

   Top of VS               (Top-1) of VS             Action

a. tag=NPT, or             tag=ST                    Do immediate assignment. That is,
   tag=SGT and deferred                              store the scalar value in NT or
   expression has one                                in M, as appropriate.
   element

b. tag=NPT                 tag=SGT and deferred      Do immediate assignment.
                           segment is a J-vector

c. tag=NPT                 tag=SGT and deferred      Do immediate assignment.
                           segment is a single DA
                           with reference count of
                           1 and value also has
                           reference count of 1

d. tag=NPT                 tag=SGT and deferred      Allocate space for a DA and value
                           segment is any            of the size necessary to store the
                           arbitrary array           result. Defer the assignment in QS,
                           expression                as for scalar arithmetic operators.

e. tag=SGT and deferred    tag=SGT and deferred      Check ranks and dimensions for
   segment consists of     segment is any            conformability. If the lhs variable
   a QS entry with         arbitrary array           is a J-vector, it must first be
   opcode IA               expression                explicitly evaluated. If the rhs
                                                     expression contains instances of
                                                     the lhs variable with different
                                                     permutations, then the rhs
                                                     expression is evaluated to temporary
                                                     space. Finally, the assignment is
                                                     deferred as above.
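
The case analysis of Table 2 can be written down as a small dispatch function. The sketch below is illustrative only: the function name asgn_action and the flag names (one_element, j_vector, single_da, da_rc, value_rc, single_ia) are hypothetical stand-ins for tests the D-machine would make on the VS and QS contents, and the returned strings merely summarize the Action column.

```python
def asgn_action(top, below):
    """Classify an assignment following cases a-e of Table 2.

    top   -- dict describing the top of VS (the destination)
    below -- dict describing (top-1) of VS (the right-hand side)
    Both carry a 'tag' key plus optional boolean/int flags; the flag
    names are assumptions made for this sketch.
    """
    dest_is_name = top["tag"] == "NPT"

    if below["tag"] == "ST":
        # Case a: scalar right-hand side.
        if dest_is_name or (top["tag"] == "SGT" and top.get("one_element")):
            return "immediate"

    if dest_is_name and below["tag"] == "SGT":
        if below.get("j_vector"):
            return "immediate"                      # case b
        if (below.get("single_da")
                and below.get("da_rc") == 1
                and below.get("value_rc") == 1):
            return "immediate"                      # case c
        return "allocate and defer"                 # case d

    if (top["tag"] == "SGT" and top.get("single_ia")
            and below["tag"] == "SGT"):
        return "check, then defer"                  # case e

    raise ValueError("not a valid assignment configuration")
```

Only the classification is modelled here; the storage allocation and the conformability checks named in the table are left out.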

The final storage management instructions are INPUT and OUTPUT, which
are left further unspecified. These could be conceived of as read-only and
write-only (serial) strings, which are used as primitives for writing functions such as
⎕ and ⍞.

2. Control Instructions

The control instructions of the APLM are all concerned with directing the
flow of control among statements at the source-language level, and are all evaluated
by the D-machine.

The three jump instructions, JMP, JMP0, and JMP1, are used to alter the
flow of control among statements in a function. Since no jumps are allowed
outside of a function, there is little difficulty in specifying this operation. All that
is necessary is to change the value of the relative pointer in the current segment
on LS. CYCLE is a special case of JMP, which sets the relative pointer to 0,
causing the current (D-mode) segment to be repeated. LEAVE pops LS and also
IS, if the segment is involved in an iteration. RETURN performs similarly
in returning from a call on a function. In addition, it automatically erases the
locals for the current function from NT.

The interpretation of the DO instruction depends on the top value on VS. If
the top of VS is a scalar then the DO acts as a no-op. If the tag is SGT, then the
segment described on VS is activated by pushing the segment descriptor to LS,
with VS being popped. In case the tag is NPT, the corresponding NT tag is examined,
and if the tag is FT, then the named function is activated, as described in the next
paragraph; all other cases are no-ops. The DOI instruction is similar to DO
except that if the top of VS has tag NPT, the value referenced is copied to new
space, while if the tag is SGT, temporary space is allocated for the result and
the segment is evaluated. Thus, after executing a DOI, the top of VS contains an
entry with tag ST, JT, or FDT.

When a DO instruction encounters a function name on top of VS, the following
actions take place:

1. The function descriptor, referenced by the NT entry for the function, is
fetched. It is expected that all parameters to the function have been evaluated
and placed on top of VS, so that the topmost value is the leftmost parameter. The
parameter count, FPAR, in the function descriptor is fetched, and the top of VS
checked to see that there are that many values already there. If not, an error is
signaled. Otherwise, the machine goes through the list of local variables in the
function descriptor, making an entry in NT for each one. Each new tag in NT is
set to UT, for undefined, unless it corresponds to a parameter. Parameter values
are placed in NT and popped from the value stack in order.

2. A function mark entry is pushed to VS, with tag FMT containing an
encoding of the current values of FREG, IORG, and the name of the function being
activated.

3. IORG is set to the value in the function descriptor, and FREG is set to
the VS index of the function mark.

4. An entry is pushed into LS for the segment described by FVBASE and
FLEN in the function descriptor, FBASE is initialized to FVBASE, and the process
is completed.
The segment just activated contains all the code for the function. When a RETURN
is executed within this function, the following occurs:

1. LS is popped, thereby de-activating the function.

2. The function name, encoded in the function mark on VS, is used to access
the function descriptor and then popped. If there is a result, the value is pushed
to VS, and its NT entry erased. All other NT entries for locals in the function,
together with their values, are also erased.

3. FREG and IORG are restored from the values in the function mark on VS.
The function mark is deleted and the result, if any, is moved into its place.

4. Finally, FBASE is set to point to the current active function (if any) by
accessing its function descriptor through its name in the newly-exposed function
mark.

3. Operator Instructions 

The operator instructions correspond to the primitive operators in APL. 
They can be considered in four groupings, and are so discussed in the rest of this 
section. Part a discusses the scalar arithmetic operators (Table 1-2); part b 
contains a description of the selection operators which are evaluated by beating 
(Table 1-3A); part c describes those operators which are generally executed 
immediately (Table 1-3 B); and part d covers remaining deferrable operators as 
well as the compound operators (Table 1-3C,D). 

a. Scalar arithmetic operators 

If the top of VS contains two scalar values (or one if the operator is monadic) 
then the operation is done immediately, leaving a result in VS and popping the 
operand(s). This process is illustrated in Example 1. In fact, the operation is 
pushed to QS and the E-machine is activated to perform the actual evaluation, but 
this micro-process is invisible to the user. 
The other possible cases occur when the top two elements of VS are segment
descriptors for deferred code in QS or when one is a segment descriptor and the
other is a scalar. If one of the operands is a scalar, it is entered into QS and its
VS entry is replaced by an appropriate segment descriptor, reducing it to the
case of two segment descriptors in VS.

The D-machine compares the ranks and dimensions of the two operands for
conformability and signals an error if they don't match. Otherwise, the operation
is deferred by drag-along in QS and the top of VS adjusted so that it contains a
segment descriptor pointing to the entire deferred expression in QS. Because of
the stack discipline in the machine, the deferred code for both operands will
always be contiguous in QS. The LINK field of the QS entry for the operator (with
opcode OP) is a relative backwards pointer to the earliest deferred operand in
the deferred subexpression. The AUX field is the same as the AUX field of the
two operands (see Example 2).
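
The drag-along of a dyadic scalar operator can be sketched as follows. This is a simplified model, not the machine's actual encoding: QS is a Python list of tuples, each VS entry is a dict with hypothetical keys start, length, and shape, and the case of a scalar operand (which the text reduces to this one by first entering the scalar into QS) is assumed to have been handled already.

```python
def defer_dyadic(op, vs, qs):
    """Defer a dyadic scalar operator whose operands are both in QS.

    Pops two segment descriptors from VS, checks conformability,
    appends the operator to QS with a backward LINK to the earliest
    operand entry, and pushes one descriptor for the whole expression.
    """
    b = vs.pop()
    a = vs.pop()
    if a["shape"] != b["shape"]:
        raise ValueError("operands are not conformable")

    # Stack discipline keeps both operand segments contiguous in QS,
    # so the combined segment starts at the earlier of the two.
    start = min(a["start"], b["start"])
    link = len(qs) - start  # relative backwards pointer
    qs.append((op, link))

    result = {"start": start, "length": len(qs) - start, "shape": a["shape"]}
    vs.append(result)
    return result
```

With the two one-word operand segments of Example 2 at QS positions 00 and 01, the deferred ADD lands at position 02 with a LINK of 2, matching the dump in Example 2-2.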

b. Selection Operators

The selection operators are evaluated in the D-machine by beating, the process
of performing a selection operation on an array-valued expression by changing
the storage mapping functions of its constituent array operands. The mathematical
analysis of Chapter II legitimizes this approach, and guarantees that the
transformations used in beating produce the correct results. Before proceeding, let
us define what it means for an array-valued expression to be beatable.

An array-valued expression deferred in QS is beatable if any of the following
conditions apply:

(i) It is a single QS entry with opcode IFA or IJ.
(ii) It is a consecutive pair of QS entries of the form

        S     scalar
        IRD   ptr     R
EXAMPLE 1 - SCALAR OPERATOR, SCALAR OPERANDS

REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   010   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0210 IN M

      TAG   VALUE                     OP    VALUE   LINK   AUX
VS:   ST    256                QS:
      ST    32                 -->
-->

EXAMPLE 1-1: BEFORE EXECUTING ADD AT M(210)


REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   011   000   100    0    0    1    0     00
      000   000   001    1    0    0    0     00
-->
EFFECTIVE ADDR = 0000 IN QS

      TAG   VALUE                     OP    VALUE   LINK   AUX
VS:   ST    256                QS: 00 OP    ADD
      ST    32                 -->
-->

THE ADD INSTRUCTION AT M(210) HAS BEEN FETCHED, DECODED,
AND DEFERRED IN QS. SINCE BOTH OPERANDS ARE SCALARS,
THE DEFERRED SEGMENT IS ACTIVATED IMMEDIATELY. (NOTE LS)

EXAMPLE 1-2: AFTER DECODING ADD; OPERATION DEFERRED IN QS


REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   011   000   100    0    0    1    0     00
      001   000   001    1    0    0    0     00
-->
EFFECTIVE ADDR = 0001 IN QS

      TAG   VALUE                     OP    VALUE   LINK   AUX
VS:   ST    288                QS: 00 OP    ADD
-->                            -->

EXAMPLE 1-3: AFTER E-MACHINE EXECUTION OF ADD; QS SEGMENT EXHAUSTED


REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   011   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0211 IN M

      TAG   VALUE                     OP    VALUE   LINK   AUX
VS:   ST    288                QS:
-->                            -->

EXAMPLE 1-4: AFTER RETURN TO D-MACHINE. RESULT OF ADD IS ON VS



EXAMPLE 2 - SCALAR OPERATOR, ARRAY OPERANDS

REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   010   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0210 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   ...   ...                   QS: 00 IFA   @A             0111   AA
      SGT   SCODE(SEG.A,1)            01 IFA   @B             0111   BB
      SGT   SCODE(SEG.B,1)        -->
-->

ARRAYS WITH DA'S AT 1000 AND 1010 ARE OF RANK 3 (NOTE QS AUX FIELDS).
NEXT INSTRUCTION IS ADD AT M(210)

EXAMPLE 2-1: BEFORE EXECUTING ADD


REGISTER DUMP
NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   011   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0211 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   ...   ...                   QS: 00 IFA   @A             0111   C_
      SGT   SCODE(SEG.C,1)            01 IFA   @B             0111
-->                                   02 OP    ADD     02     0111   _C
                                  -->

EXAMPLE 2-2: AFTER DEFERRING ADD

(iii) It is a QS segment consisting of a scalar monadic operator operating
      on a beatable sub-segment. That is, it is of the form:

        code for operand
        OP    optype    1    R

(iv) It is a QS segment consisting of a pair of beatable operands combined
     by a dyadic scalar operator. One of these operands can optionally
     be a scalar value. The form is:

        code for right opnd
        code for left opnd
        OP    optype    k    R

(v) It is a pair of beatable operands combined by GDF. The form is
    similar to case (iv) above.
(vi) It is a reduction of a beatable operand, in the form:

        RED   k
        code for reducee         (segment A)
    k:  OP    reduce-op
        SGV   SEG.A
        S     -length
        ITM

(vii) In addition to (i) through (vi) above, a single QS entry with opcode IA
      is beatable, although it does not enter into the recursive definition.

When a selection operation is interpreted by the D-machine, the array-valued
operand is first checked for conformability. If the operand is beatable, then it
is beaten, according to the transformations shown in Chapter III, Appendix A. In
this process, if a DA to be transformed has a reference count of 1, indicating that
it is a local temporary result, then the DA can be modified directly. If the reference
count is greater than 1, then a copy must be made, and the copy is beaten. If the
result of a beating operation is a scalar value, then the segment is turned over to
the E-machine, which evaluates it and leaves the scalar result on the top of VS.

When the operand of a selection operation is not beatable, there are two
possible strategies to follow. In the case of the TRANS operation, there is no
choice: the operand must be evaluated by the E-machine and a temporary value
stored, which is then beaten as above. Otherwise, the selection operation can
be treated as a special case of subscripting, in which case an appropriate set of
E-machine instructions is dragged along in QS. (See Section d for an explanation
of subscripting.) The choice of strategies is a second-order design decision,
and need not be made at this time, since either approach is viable. Example 3
illustrates both beating of selection operators and drag-along of scalar operators.

The DM code shown for the statement is a straightforward translation of the
APL statement into Polish. Note that the vector 2,¯2 is a constant and is
"compiled" into the function segment. This approach avoids having to keep
array-valued constants in the memory with other array quantities; to do so would require
having an entry in NT for each such constant, and would complicate the storage
management functions. In Examples 3-1 and 3-2, the state of the machine before
executing the sample code is shown; the values of the variables M and N are not
EXAMPLE 3: DRAG-ALONG AND BEATING IN THE D-MACHINE

Consider the APL expression

        R←(2,1)⍉(⌽[1]M)+(2,¯2)↑N

At the time this is to be evaluated, ρM ↔ 2,2 and ρN ↔ 3,4. Assume that R
has no current value. The machine code for this statement is shown as follows,
starting at location 250 in memory.

Addr   Op      Operand          Comments

250    LDNF    N
252    LDCON   90               Refers to constant 2,¯2 with DA at 290
254    TAKE
255    LDNF    M
257    REV     0                (Recall 0-base in all machine code)
259    ADD
260    LDJ     JCODE(2,1,1)     This is the vector 2,1
262    TRANS
263    LDN     R
265    ASGN                     Assign (and discard value)
266    ...

290    RC=1    LEN=4            DA header
291    VB=0    AB=94            DA for constant vector 2,¯2.
292    RANK=1                   See Section A for description
293    R(1)=2  D(1)=1           of format.
294    RC=1    LEN=3            Header for value array
295    2                        Value
296    -2


given, as they are irrelevant for this example. LS contains a descriptor for a
D-machine segment of length 100, which is the main segment of the function F.
The effective address is the sum of the REL field of LS and FBASE, the beginning
of the value part of function F. VS contains a function mark for F which was
placed there when F was called.

In 3-3 and 3-4, the LDNF and LDCON instructions have been executed. Note
that each caused the deferral of an IFA instruction (fetch array element in the E-machine)
in QS. Also, for each deferred instruction, a QS segment descriptor was pushed
to VS. The LDCON instruction allocated space and made a copy of the descriptor
array for the constant which was in the function segment; the new DA is named T1.
The VBASE for the constant is 200, the same as the FBASE of the function.

The TAKE operation (3-5, 3-6) is evaluated by the DM using beating. The
descriptor array T2 was created for the result, and was derived from the DA for
N by the transformations listed in Chapter III, Appendix A. It is easy to see that
this DA is in fact the correct one. Also note that T1 is no longer needed, and has
been erased. At this point, VS contains a segment descriptor which points to the
QS segment describing the result of the computation to date, which is the evaluation
of the subexpression (2,¯2)↑N.

Examples 3-7 through 3-9 show the next LDNF instruction and the evaluation
of the reversal operation by beating. The process in this case is similar to that
for the TAKE. The ADD operation is deferred in 3-10 because both of its operands
were array values. The LINK field of the ADD in QS is 2, referring to the operand
2 elements earlier in QS. The top of VS now contains a descriptor for the entire
subexpression in QS which has been evaluated at this point. The LDJ instruction
(3-11) is executed similarly to LDNF and LDCON in that it defers a value in QS.
The TRANS instruction takes the transpose of the entire expression which
has been dragged along so far. In this case, since its operand is a sum, the
transpose is applied to both terms. Notice that although the deferred code in QS
has not been altered (3-12), the DA's which it references have been (3-13). The
LDN R instruction pushes a value with tag NPT to VS (3-14); the next instruction
is an ASGN (3-15). This instruction notes that R was undefined (see NT in
Example 3-1) and allocates space for its DA and its value array. The space is
allocated based on the knowledge of the size of the result deferred in QS. In
3-15, we see the deferral of the assignment. The POP instruction in QS disposes
of the value after it has been assigned (in deferring ASGNV, no POPs are used).
In 3-16, the state of memory shows the new DA for R; also note that the address
of the DA for R (@R) has been entered in NT by the ASGN evaluation.
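
The descriptor-array arithmetic in Example 3 can be checked with a short Python sketch. It is an illustration only: the class DA and the functions take, reverse, transpose, and address are hypothetical names, indices are 0-origin as in the machine code, over-take and diagonal (repeated-axis) transposes are deliberately not handled, and the rules used are the usual base-and-stride updates rather than a transcription of the transformations in Chapter III, Appendix A.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DA:
    vbase: str    # symbolic address of the value array (VB)
    abase: int    # offset of element [0;0;...] in the value array (AB)
    rho: tuple    # R(i): length of each coordinate
    delta: tuple  # D(i): stride of each coordinate


def take(da, counts):
    """Beat a TAKE: positive n keeps the first n, negative the last |n|."""
    abase = da.abase
    for n, r, d in zip(counts, da.rho, da.delta):
        if abs(n) > r:
            raise ValueError("over-take is not handled in this sketch")
        if n < 0:
            abase += (r + n) * d  # skip the leading r-|n| elements
    return replace(da, abase=abase, rho=tuple(abs(n) for n in counts))


def reverse(da, k):
    """Beat a REV along coordinate k: start at the far end, negate stride."""
    abase = da.abase + (da.rho[k] - 1) * da.delta[k]
    delta = tuple(-d if i == k else d for i, d in enumerate(da.delta))
    return replace(da, abase=abase, delta=delta)


def transpose(da, perm):
    """Beat a TRANS: source coordinate i becomes result coordinate perm[i]."""
    if sorted(perm) != list(range(len(da.rho))):
        raise ValueError("only true permutations are handled in this sketch")
    rho = [0] * len(perm)
    delta = [0] * len(perm)
    for src, dst in enumerate(perm):
        rho[dst] = da.rho[src]
        delta[dst] = da.delta[src]
    return replace(da, rho=tuple(rho), delta=tuple(delta))


def address(da, index):
    """Offset in the value array of the element at `index`."""
    return da.abase + sum(i * d for i, d in zip(index, da.delta))
```

Starting from N with dimension 3,4 and strides 4,1, take(N, (2, -2)) gives ABASE 2 with dimension 2,2, as in Example 3-6; reversing M (dimension 2,2, strides 2,1) along coordinate 0 gives ABASE 2 and D(1) = -2, as in Example 3-9; and transposing either result with (1, 0) simply interchanges the R and D entries, as in Example 3-13. No element of M or N is touched at any point.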

c. Other Operators (Executed Directly)

The "other operators" include all those APL primitives which cannot be
deferred conveniently, or which are evaluated immediately in the D-machine.
BASE is in this class because it has a scalar result, while REP, GDU, and GDD are
included because they require rather complex calculations involving their entire
operands simultaneously, which are impossible or difficult to do element-by-element.
URHO is easily done by the D-machine, and so is not deferred, as is UIOTA,
which produces a J-vector as result. The catenation operator, with operand K,
is a direction to catenate the top K elements of VS to form a vector. This is
done immediately (with the result being put in temporary space). The remainder
of the operators in this class are dealt with differently, depending on the values
of their operands.
EXAMPLE 3 - DRAG-ALONG AND BEATING

MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS              NT:  NAME  TAG  CONTENTS
@M    RC=1  LEN=05          @N    RC=1  LEN=05               F     FT   @F
+01   VB=VM AB=000          +01   VB=VN AB=000               M     DT   @M
+02   RANK=2                +02   RANK=2                     N     DT   @N
+03   R(1)=002 D(1)=02      +03   R(1)=003 D(1)=04           R     UT
+04   R(2)=002 D(2)=01      +04   R(2)=004 D(2)=01

EXAMPLE 3-1: MEMORY BEFORE EXECUTING EXAMPLE CODE


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   050   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0250 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS:
-->                               -->

EXAMPLE 3-2: REGISTERS BEFORE EXECUTING EXAMPLE CODE


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   054   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0254 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @N             0011   AA
      SGT   SCODE(SEG.A,1)            01 IFA   @T1            0001   BB
      SGT   SCODE(SEG.B,1)        -->
-->

LDNF PUSHED QS(0;) AND VS(1;)
LDCON PUSHED QS(1;) AND VS(2;)

EXAMPLE 3-3: AFTER LDNF AND LDCON

MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS              ADDR  CONTENTS
@M    RC=1  LEN=05          @N    RC=2  LEN=05          @T1   RC=1  LEN=04
+01   VB=VM AB=000          +01   VB=VN AB=000          +01   VB=200 AB=094
+02   RANK=2                +02   RANK=2                +02   RANK=1
+03   R(1)=002 D(1)=02      +03   R(1)=003 D(1)=04      +03   R(1)=002 D(1)=01
+04   R(2)=002 D(2)=01      +04   R(2)=004 D(2)=01

DA FOR N NOW HAS REFCO OF 2. T1 IS A COPY OF THE DA FOR THE VECTOR 2,-2

EXAMPLE 3-4: MEMORY AFTER LDCON


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   054   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0254 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   AA
      SGT   SCODE(SEG.A,1)        -->
-->

THE TAKE HAS ALTERED THE DA FOR N, CREATING A NEW COPY.

EXAMPLE 3-5: REGISTERS AFTER TAKE OPERATOR


MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS              ADDR  CONTENTS
@M    RC=1  LEN=05          @N    RC=1  LEN=05          @T2   RC=1  LEN=05
+01   VB=VM AB=000          +01   VB=VN AB=000          +01   VB=VN AB=002
+02   RANK=2                +02   RANK=2                +02   RANK=2
+03   R(1)=002 D(1)=02      +03   R(1)=003 D(1)=04      +03   R(1)=002 D(1)=04
+04   R(2)=002 D(2)=01      +04   R(2)=004 D(2)=01      +04   R(2)=002 D(2)=01

THE NEW DA AT @T2 CONTAINS THE STORAGE ACCESS FUNCTION FOR THE
TAKE OPERATION ON N, WHICH WAS PRODUCED BY BEATING. NOTE IN PARTICULAR
THAT THE VBASE OF T2 IS VN, WHICH POINTS TO THE VALUE ARRAY OF N, AND
THAT THE DIMENSION OF T2 IS 2,2, AS SPECIFIED BY THE TAKE OPERATOR.
THE ABASE HAS CHANGED FROM 0 TO 2, TO ACCOUNT FOR THE -2 ELEMENT IN THE
PARAMETER (I.E. TAKE FROM THE END). FINALLY, NOTE THAT THE VALUE OF DEL
IN T2 IS THE SAME AS THAT FOR N.

EXAMPLE 3-6: MEMORY AFTER TAKE OPERATOR



REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   056   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0256 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   AA
      SGT   SCODE(SEG.A,1)            01 IFA   @M             0011   BB
      SGT   SCODE(SEG.B,1)        -->
-->

EXAMPLE 3-7: AFTER LDNF M


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   058   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0258 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   AA
      SGT   SCODE(SEG.A,1)            01 IFA   @T3            0011   BB
      SGT   SCODE(SEG.B,1)        -->
-->

EXAMPLE 3-8: AFTER REV


MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS
@M    RC=1  LEN=05          @N    RC=1  LEN=05
+01   VB=VM AB=000          +01   VB=VN AB=000
+02   RANK=2                +02   RANK=2
+03   R(1)=002 D(1)=02      +03   R(1)=003 D(1)=04
+04   R(2)=002 D(2)=01      +04   R(2)=004 D(2)=01

@T2   RC=1  LEN=05          @T3   RC=1  LEN=05
+01   VB=VN AB=002          +01   VB=VM AB=002
+02   RANK=2                +02   RANK=2
+03   R(1)=002 D(1)=04      +03   R(1)=002 D(1)=-2
+04   R(2)=002 D(2)=01      +04   R(2)=002 D(2)=01

NOTICE THE NEW DA, @T3, WHICH CONTAINS THE ACCESS FUNCTION FOR THE
REVERSAL ON M. THE PARTS WHICH HAVE CHANGED FROM THE DA AT @M ARE
ABASE, WHICH IS NOW 2, AND DEL(1), WHICH IS -2 INSTEAD OF 2. THESE
CHANGES ACCOUNT FOR THE REVERSAL OF M, ANALOGOUSLY TO THE WAY THE DA
AT @T2 ACCOUNTS FOR THE TAKE OPERATION ON N.

EXAMPLE 3-9: AFTER REV


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   059   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0259 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   C_
      SGT   SCODE(SEG.C,1)            01 IFA   @T3            0011
-->                                   02 OP    ADD     02     0011   _C
                                  -->

EXAMPLE 3-10: AFTER ADD



REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   061   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0261 IN M

      TAG   VALUE                        OP    VALUE          LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2                   0011   C_
      SGT   SCODE(SEG.C,1)            01 IFA   @T3                   0011
      SGT   SCODE(SEG.D,1)            02 OP    ADD            02     0011   _C
-->                                   03 IJ    JCODE(2,1,1)          0001   DD
                                  -->

EXAMPLE 3-11: AFTER LDJ


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   062   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0262 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   C_
      SGT   SCODE(SEG.C,1)            01 IFA   @T3            0011
-->                                   02 OP    ADD     02     0011   _C
                                  -->

EXAMPLE 3-12: REGISTERS AFTER TRANS


MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS
@M    RC=1  LEN=05          @N    RC=1  LEN=05
+01   VB=VM AB=000          +01   VB=VN AB=000
+02   RANK=2                +02   RANK=2
+03   R(1)=002 D(1)=02      +03   R(1)=003 D(1)=04
+04   R(2)=002 D(2)=01      +04   R(2)=004 D(2)=01

@T2   RC=1  LEN=05          @T3   RC=1  LEN=05
+01   VB=VN AB=002          +01   VB=VM AB=002
+02   RANK=2                +02   RANK=2
+03   R(1)=002 D(1)=01      +03   R(1)=002 D(1)=01
+04   R(2)=002 D(2)=04      +04   R(2)=002 D(2)=-2

THE EFFECT OF THE TRANSPOSE WAS TO ALTER THE DA'S AT @T2 AND @T3.
THE CHANGE IN BOTH CASES WAS TO INTERCHANGE R(1) WITH R(2), AND
D(1) WITH D(2). IT SHOULD BE INTUITIVELY CLEAR THAT THESE DA'S WILL
NOW ACCESS THE TRANSPOSES OF THEIR PREVIOUS VALUES.

EXAMPLE 3-13: MEMORY AFTER TRANS (NOTE ALTERED DA'S)


REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   064   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0264 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   C_
      SGT   SCODE(SEG.C,1)            01 IFA   @T3            0011
      NPT   R                         02 OP    ADD     02     0011   _C
-->                               -->

EXAMPLE 3-14: AFTER LDN R



REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

      REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:   065   000   100    0    0    1    0     00
-->
EFFECTIVE ADDR = 0265 IN M

      TAG   VALUE                        OP    VALUE   LINK   AUX
VS:   FMT   *FN MARK FOR F*       QS: 00 IFA   @T2            0011   E_
      SGT   SCODE(SEG.E,1)            01 IFA   @T3            0011
-->                                   02 OP    ADD     02     0011
                                      03 IFA   @R             0011
                                      04 OP    ASGN    02     0011
                                      05 POP                  0011   _E
                                  -->

EXAMPLE 3-15: REGISTERS AFTER ASGN


MEMORY DUMP

ADDR  CONTENTS              ADDR  CONTENTS              NT:  NAME  TAG  CONTENTS
@M    RC=1  LEN=05          @T2   RC=1  LEN=05               F     FT   @F
+01   VB=VM AB=000          +01   VB=VN AB=002               M     DT   @M
+02   RANK=2                +02   RANK=2                     N     DT   @N
+03   R(1)=002 D(1)=02      +03   R(1)=002 D(1)=01           R     DT   @R
+04   R(2)=002 D(2)=01      +04   R(2)=002 D(2)=04

@N    RC=1  LEN=05          @T3   RC=1  LEN=05
+01   VB=VN AB=000          +01   VB=VM AB=002
+02   RANK=2                +02   RANK=2
+03   R(1)=003 D(1)=04      +03   R(1)=002 D(1)=01
+04   R(2)=004 D(2)=01      +04   R(2)=002 D(2)=-2

@R    RC=1  LEN=05
+01   VB=VR AB=000
+02   RANK=2
+03   R(1)=002 D(1)=02
+04   R(2)=002 D(2)=01

EXAMPLE 3-16: MEMORY AFTER ASGN



RAV and DRHO are difficult to defer in general because of the complex
calculations necessary to access an arbitrary element of the result. However,
there are special cases which are easy to defer, as follows:

(i) The right operand is a scalar or single-element quantity. The RAV
of such a value is a J-vector if it is an integer, or at worst is an
explicit one-element vector. Similarly, the DRHO of such a value
is deferred in QS as follows:

S value

IRD @T1 R

where @T1 is a DA for the result and R is the encoding of the rank.
The IRD instruction is essentially a note to the D-machine that the
result has the dimension described in @T1.
(ii) The right operand B is an expression deferred in the form of (i) above.
In this case, all that has to be done is to change the descriptor array
@T1.
(iii) The right operand is of the form

IFA @W R
and @W points to a DA which has not been altered by any select
operations which upset the ordering of the value part. That is, if
W is the array specified by @W and V is the vector containing the
value part, then W[;/L] ↔ V[(ρW)⊥L] for all appropriate values of L.
In this case, RAV is evaluated by providing a new DA with rank 1 and
dimension ×/ρW. DRHO can be deferred if ×/A, where A is the
left operand of the DRHO, is less than or equal to ×/ρW, also by
providing a new DA with dimension A.
If none of the above apply, then RAV and DRHO are evaluated immediately by
creating temporary values in M.
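Case (iii) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the machine's algorithm: the `DA` class, its field names, and the test that the descriptor is "undisturbed" (ABASE of zero and row-major DEL values) are hypothetical stand-ins for the condition described above.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class DA:
    """Hypothetical descriptor array: value base, address base, dims, strides."""
    vbase: str
    abase: int
    dims: list
    dels: list


def row_major_strides(dims):
    """DEL values of an array stored in plain row-major order."""
    strides, step = [], 1
    for d in reversed(dims):
        strides.append(step)
        step *= d
    return strides[::-1]


def is_undisturbed(da):
    """Assumed test for 'no select operation has upset the value ordering'."""
    return da.abase == 0 and da.dels == row_major_strides(da.dims)


def defer_rav(da):
    """RAV by a new rank-1 DA of dimension x/rho W, or None if not deferrable."""
    if not is_undisturbed(da):
        return None
    return DA(da.vbase, 0, [prod(da.dims)], [1])


def defer_drho(new_dims, da):
    """DRHO by a new DA of dimension A when x/A <= x/rho W, else None."""
    if not is_undisturbed(da) or prod(new_dims) > prod(da.dims):
        return None
    return DA(da.vbase, 0, list(new_dims), row_major_strides(new_dims))
```

A `None` result here stands for the fallback in the text, where the operation is evaluated immediately into temporary space in M.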
d. Other Operators and Compound Operators (Deferrable)
The D-machine evaluates this subclass of operator instructions by deferring
E-machine code in QS. The expansions are detailed in Appendix C and should be
easy to understand with a knowledge of the way the E-machine works. We will
here discuss only the SUBS instruction and the compound operators, as their
behavior is somewhat more complex.

The SUBS K operation corresponds to the symbol [ in an APL program.
When decoded, it expects the top of VS to contain a QS segment descriptor for a
rank-K quantity and the next K entries on VS to be either scalars or QS segment
descriptors for the subscript expressions. An empty subscript position is created
by the LDSEG instruction with its operand a segment descriptor SCODE(0,0,0) of
length 0.

There are two important cases to consider: 

(i) If the subscriptee is beatable, then the subscript expressions are
examined in turn, starting from the rightmost (deepest in VS), to
find scalars or J-vectors. If one is found for, say, the I-th coordinate,
the equivalent of INX I with that operand is performed on the
subscriptee by beating, causing new DA's to be created for it. The VS
entry for this subscript is then deleted if it was a scalar. If it was
a J-vector, then the VS entry is changed to the empty segment and
the QS entry is deleted by moving all of QS down 1 to fill in the space
(with appropriate adjustments to descriptors). If, after all subscripts
have been examined, it is found that the remaining stacked subscripts
are either empty or non-existent, then the result already exists, in
standard form, in QS. In this case, the remaining empty segment
descriptors are removed from VS and the result is the QS descriptor
at the top of VS. Otherwise, the remaining subscripts are treated
as in the second case, described in the next paragraph.
(ii) If there are explicit non-scalar or non-J-vector subscript expressions
and/or the subscriptee is not beatable, then the subscripts must be
dragged along in QS. This is done by creating temporary index
accumulators (opcode XT) in QS and generating E-machine code to
activate the necessary subscript evaluations at the right times. If
the subscriptee is a reduction, QS is transformed according to the
transformation (OP/A)[S] → OP/A[S;] and evaluation continues
as above. The details of the subscript expansion are shown in
Appendix C. Example 4 illustrates the process which has just been
described.
In evaluating a GDF, the machine first examines the operands. If they contain
deferred operators, then they are evaluated to temporary space first. This is
done to avoid unnecessary recalculation of the subexpressions needed to compute
a GDF. It also guarantees the possibility of applying SF transforms to GDF
expressions by beating. Then all that is necessary is to alter the access masks in
the AUX fields of the deferred left operand in QS to provide the proper access
method for the E-machine. This is illustrated in Example 5 below. If the GDF
reduces to a simple case, e.g., if one of the operands is a scalar, then the
expression is treated as a normal scalar operator expression (see part a above).

Efficient evaluation of reductions along coordinate K of the reducee R (in the
E-machine) depends on transformation TR11 (see Chapter II), which allows
permutation of the reduction coordinate by transposing the reducee. In evaluating a
REDUCE along coordinate K, the reducee is first checked to see if it fits into one
of the special cases of reduction:

(i) Empty reduction coordinate. The result is then an array with value
((K≠⍳ρρR)/ρR)ρIDENT, where R is the reducee and IDENT is the
identity element for the reduction operator.
(ii) Reduction coordinate of length 1. The result is then R[[K] IORG].
If the reducee is a scalar, the result is R.
(iii) Reducee is a vector. In this case, the reduction is activated
immediately in the E-machine, since the result is a scalar.
If none of the special cases is satisfied, the reduction is deferred by first doing
the transpose of TR11 if necessary, and generating the deferred code in QS as
shown in Appendix C.
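The order in which these special cases are tested can be illustrated with a short Python sketch. It is a hedged illustration only: the function name, the 0-origin coordinate `k`, and the returned labels are hypothetical, and the result dimension is computed as the compression (K≠⍳ρρR)/ρR used in case (i).

```python
def classify_reduction(dims, k):
    """Classify a reduction of an array with dimension vector `dims`
    along 0-origin coordinate `k`.  Returns (case label, result dims)."""
    if len(dims) == 0:
        # A scalar reducee is its own result.
        return ("scalar", [])
    # Dimension of the result: drop coordinate k from the reducee's dims.
    result_dims = [d for i, d in enumerate(dims) if i != k]
    if dims[k] == 0:
        # Case (i): result is filled with the operator's identity element.
        return ("identity_fill", result_dims)
    if dims[k] == 1:
        # Case (ii): select the single element along coordinate k.
        return ("drop_coordinate", result_dims)
    if len(dims) == 1:
        # Case (iii): vector reducee, evaluated immediately to a scalar.
        return ("immediate", [])
    # Otherwise the reduction is deferred in QS.
    return ("deferred", result_dims)
```

For instance, a 4-by-2 reducee reduced along its last coordinate falls through to the deferred case with a result of dimension 4.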

EXAMPLE 4: SUBSCRIPTING IN D-MACHINE

Consider the APL expression A[⍳4;;2;V] where A is a rank-4 array with
ρA ↔ 5,4,6,3 and V ↔ 3,2,1,2, with the index origin IORG ↔ 1. The D-machine
code for evaluating this expression is

250   LDNF    V               Vector V
252   LDS     2               Scalar 2
254   LDSEG   SCODE(0,0,0)    Empty subscript
256   LDS     4               Scalar 4
258   UIOTA                   Gives ⍳4
259   LDNF    A               Array A
261   SUBS    4               Do the subscript; expected operand rank is 4
263   ...

The following memory and register dumps show the steps the D-machine goes through
to evaluate this expression.
EXAMPLE 4 - SUBSCRIPTING IN D-MACHINE
MEMORY DUMP

ADDR   CONTENTS               ADDR   CONTENTS               NT:   TAG   CONTENTS

@A     RC=1      LEN=07       @V     RC=1      LEN=04       A     DT    @A
+01    VB=VA     AB=000       +01    VB=VV     AB=000       V     DT    @V
+02    RANK=4                 +02    RANK=1
+03    R(1)=005  D(1)=72      +03    R(1)=004  D(1)=01
+04    R(2)=004  D(2)=18
+05    R(3)=006  D(3)=03
+06    R(4)=003  D(4)=01

EXAMPLE 4-1: MEMORY BEFORE EXECUTING EXAMPLE CODE



REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000

       REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:    061   000   100                         00
EFFECTIVE ADDR = 0261 IN M

       TAG   VALUE
VS:    ..    ...
       SGT   SCODE(SEG.A,1)
       ST    2
       SGT   SCODE(SEG.NIL,0)
       SGT   SCODE(SEG.B,1)
       SGT   SCODE(SEG.C,1)
-->

       QP    OP    VALUE          LINK   AUX    SEG
QS:    00    IFA   @V                    0001   AA
       01    IJ    JCODE(4,1,0)          0001   BB
       02    IFA   @A                    1111   CC
-->

VS CONTENTS ARE THE SUBSCRIPTS AND SUBSCRIPTEE. NOTE THE ACCESS MASKS
IN THE AUX FIELD OF QS. THEY INDICATE THAT V AND THE J-VECTOR ARE
VECTORS, AND A IS A RANK-4 ARRAY.

EXAMPLE 4-2: AFTER ALL BUT THE SUBS OPERATOR



REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000

       REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:    063   000   100                         00
EFFECTIVE ADDR = 0263 IN M

       TAG   VALUE
VS:    ..    ...
       SGT   SCODE(SEG.D,1)
-->

       QP    OP    VALUE            LINK   AUX    SEG
QS:    00    JMP                    06            D.
       01    IFA   @V                      0001   EE
       02    IFA   @T1                     0111   FF
       03    XT    XCODE(0,3,1)     03
       04    XT    XCODE(0,3,1)
       05    XT    XCODE(0,2,1)
       06    IXL                           0100
       07    XS                     04
       08    IXL                           0010
       09    XS                     05
       10    ISC   SCODE(SEG.E,1)          0001
       11    XS                     06
       12    SG    SCODE(SEG.F,1)   09
       13    IRD   @T2                     0111   .D
-->

VS AND QS HAVE BEEN TRANSFORMED BY THE SUBS OPERATION. THE SCALAR
SUBSCRIPT REDUCED THE RANK OF A BY 1, AND THE INTERVAL VECTOR
SHORTENED THE FIRST COORDINATE (SEE DA AT @T1). THE REST OF THE
CODE GENERATED IN QS IS FOR CALCULATING EXPLICIT SUBSCRIPT VALUES,
WHICH ARE KEPT IN THE XT ENTRIES. THESE ENTRIES CONSTITUTE A
PSEUDO-ITERATION STACK (SEE SECTION E).

EXAMPLE 4-3: REGISTERS AFTER SUBS



MEMORY DUMP

ADDR   CONTENTS               ADDR   CONTENTS               ADDR   CONTENTS

@A     RC=1      LEN=07       @V     RC=2      LEN=04       @T2    RC=1      LEN=06
+01    VB=VA     AB=000       +01    VB=VV     AB=000       +01    VB=       AB=000
+02    RANK=4                 +02    RANK=1                 +02    RANK=3
+03    R(1)=005  D(1)=72      +03    R(1)=004  D(1)=01      +03    R(1)=004  D(1)=16
+04    R(2)=004  D(2)=18                                    +04    R(2)=004  D(2)=04
+05    R(3)=006  D(3)=03      @T1    RC=1      LEN=06       +05    R(3)=004  D(3)=01
+06    R(4)=003  D(4)=01      +01    VB=VA     AB=003
                              +02    RANK=3
                              +03    R(1)=004  D(1)=72
                              +04    R(2)=004  D(2)=18
                              +05    R(3)=003  D(3)=01

EXAMPLE 4-4: MEMORY AFTER SUBS
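The beating step of Example 4 can be reproduced with a small Python sketch. It is a simplified model under stated assumptions rather than the D-machine's actual code: the `DA` class and the helpers `beat_scalar` and `beat_jvector` are hypothetical names, coordinates are numbered from 0 inside the sketch, and no bounds checking is done.

```python
from dataclasses import dataclass


@dataclass
class DA:
    """Hypothetical descriptor array: address base, dimensions R, strides DEL."""
    abase: int
    dims: list
    dels: list


def beat_scalar(da, coord, index, iorg=1):
    """Absorb a scalar subscript: shift ABASE and drop the coordinate."""
    offset = (index - iorg) * da.dels[coord]
    return DA(da.abase + offset,
              da.dims[:coord] + da.dims[coord + 1:],
              da.dels[:coord] + da.dels[coord + 1:])


def beat_jvector(da, coord, first, length, step=1, iorg=1):
    """Absorb an interval (J-vector) subscript: shift ABASE, shorten the coordinate."""
    dims, dels = list(da.dims), list(da.dels)
    dims[coord] = length
    dels[coord] = da.dels[coord] * step
    return DA(da.abase + (first - iorg) * da.dels[coord], dims, dels)


# A has dimension 5,4,6,3 stored in row-major order (DEL = 72,18,3,1).
a = DA(0, [5, 4, 6, 3], [72, 18, 3, 1])
# Subscripts are examined from the right: the scalar 2 on the third
# coordinate first, then the interval vector (1..4) on the first coordinate.
t1 = beat_jvector(beat_scalar(a, 2, 2), 0, first=1, length=4)
```

Under these assumptions `t1` has ABASE 3, dimension 4,4,3 and DEL 72,18,1, which agrees with the DA at @T1 in Example 4-4. The explicit subscript V is not beaten and is left for the drag-along code in QS.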



EXAMPLE 5: GDF IN D-MACHINE

In the example expression, M∘.×N, both M and N are matrices with ρM ↔ 4,3
and ρN ↔ 3,2. D-machine code for this expression is

250   LDNF   N
252   LDNF   M
254   GDF    MUL    Do GDF
256   ...

Examples 5-1 and 5-2 show the machine state before evaluating this code. In 5-3, the
GDF operation has been deferred in QS. Notice that the access mask of M
in the AUX field of QS has been changed. The IRD entry, whose operand DA gives
the dimension of the result, contains 1111 in its AUX field, which instructs the
EM to use a 4-level iteration stack to evaluate the expression. The 1100 AUX for
M says that M-indices come from the two highest iterations, while the 0011 AUX
for N indicates that N is to use the two lowest.

An equivalent formulation of the contents of QS at this point is that it represents
the GDF in the form:

for I := 0 step 1 until 3 do
for J := 0 step 1 until 2 do
for K := 0 step 1 until 2 do
for L := 0 step 1 until 1 do
RESULT[I;J;K;L] := M[I;J] × N[K;L];
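The role of the two access masks can be made concrete with a short Python sketch. It is an illustration only: `select` and `outer_product` are hypothetical helper names, the masks are written as bit strings, and the result is kept in a dictionary keyed by the four iteration indices rather than in machine storage.

```python
from itertools import product
from operator import mul


def select(mask, indices):
    """APL-style compress: keep the indices whose mask bit is 1."""
    return [i for bit, i in zip(mask, indices) if bit == "1"]


def outer_product(m, n, op):
    """Evaluate M o.op N over a 4-level iteration nest (I, J, K, L)."""
    shape = (len(m), len(m[0]), len(n), len(n[0]))
    result = {}
    for idx in product(*(range(d) for d in shape)):
        i, j = select("1100", idx)   # M takes the two highest iterations
        k, l = select("0011", idx)   # N takes the two lowest iterations
        result[idx] = op(m[i][j], n[k][l])
    return result


m = [[r * 3 + c for c in range(3)] for r in range(4)]    # a 4-by-3 matrix
n = [[10 * r + c for c in range(2)] for r in range(3)]   # a 3-by-2 matrix
result = outer_product(m, n, mul)
```

The same iteration nest serves both operands; only the mask decides which of the four counters each operand reads, which is why the D-machine needs to change nothing but the AUX fields.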
EXAMPLE 5 - GDF IN D-MACHINE

REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

       REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:    054   000   100                         00
EFFECTIVE ADDR = 0254 IN M

       TAG   VALUE                  QP   OP    VALUE   LINK   AUX    SEG
VS:    ..    ...             QS:    00   IFA   @N             0011   AA
       SGT   SCODE(SEG.A,1)         01   IFA   @M             0011   BB
       SGT   SCODE(SEG.B,1)  -->
-->

EXAMPLE 5-1: REGISTERS BEFORE GDF

MEMORY DUMP

ADDR   CONTENTS               ADDR   CONTENTS

@M     RC=1      LEN=05       @N     RC=1      LEN=05
+01    VB=VM     AB=000       +01    VB=VN     AB=000
+02    RANK=2                 +02    RANK=2
+03    R(1)=004  D(1)=03      +03    R(1)=003  D(1)=02
+04    R(2)=003  D(2)=01      +04    R(2)=002  D(2)=01

EXAMPLE 5-2: MEMORY BEFORE GDF

REGISTER DUMP
NEWIT = 0   IORG = 1   FREG = 00000   FBASE = 00200

       REL   ORG   LEN   D/E   IS   FN   NWT   QP
LS:    056   000   100                         00
EFFECTIVE ADDR = 0256 IN M

       TAG   VALUE                  QP   OP    VALUE   LINK   AUX    SEG
VS:    ..    ...             QS:    00   IFA   @N             0011   C.
       SGT   SCODE(SEG.C,1)         01   IFA   @M             1100
-->                                 02   GOP   MUL            1111
                                    03   IRD   @T1            1111   .C
                             -->

EXAMPLE 5-3: AFTER GDF - NOTE CHANGED AUX FIELDS IN QS

MEMORY DUMP

ADDR   CONTENTS               ADDR   CONTENTS               ADDR   CONTENTS

@M     RC=2      LEN=05       @N     RC=2      LEN=05       @T1    RC=1      LEN=07
+01    VB=VM     AB=000       +01    VB=VN     AB=000       +01    VB=       AB=000
+02    RANK=2                 +02    RANK=2                 +02    RANK=4
+03    R(1)=004  D(1)=03      +03    R(1)=003  D(1)=02      +03    R(1)=004  D(1)=18
+04    R(2)=003  D(2)=01      +04    R(2)=002  D(2)=01      +04    R(2)=003  D(2)=06
                                                            +05    R(3)=003  D(3)=02
                                                            +06    R(4)=002  D(4)=01

@T1 WAS CREATED SIMPLY TO RECORD THE RANK AND DIMENSION VECTOR OF
THE RESULT OF DOING THE OUTER PRODUCT. THE OPCODE IRD, IN QS(03),
SIGNIFIES THAT ITS OPERAND DA IS DESCRIPTIVE, AND IS NOT TO BE
EXECUTED. IN THE E-MACHINE, IRD IS IGNORED.

EXAMPLE 5-4: MEMORY AFTER GDF



E. The E-Machine

The E-machine is a simple stack-oriented computer which evaluates array-valued
expressions by iterating element-by-element over their index sets. The
EM takes its instructions from the instruction buffer (QS), where they were put
by the D-machine. Other machine registers are used in the same way as in the DM.

The central task of the EM is to access individual array elements in computing
array-valued expressions. As most of the complexity of the E-machine is related
to this task, we first discuss the accessing mechanisms in the EM. Given this,
it is a simple matter to explain the instruction set of the machine.

1. Array Accessing

a. Indexing Environment

Array reference instructions are entered in QS in the form

IFA @VAR MASK
where @VAR is the address of a DA in M, and MASK is a logical access mask.
When such an instruction is first entered in QS by the D-machine, it is done without
regard to its context in the input expression. The E-machine must, in order to
evaluate it, determine its context, which takes the form of an indexing environment
for an array reference. The indexing environment of an instruction in QS is
determined by how the segment containing the instruction was activated, which in
turn relates to the form of the original expression input to the D-machine.

(i) If the QP field of the top of LS is zero, then the environment is simple,
and array references within this segment are based directly on the
iteration stack. A simple environment arises for variables not affected by
explicit subscripting and which are not operands in expressions which cause
expansions to be made by the DM. For example, in the statement A←B+C,
all variables have simple environments.
(ii) If the QP field of LS is non-zero, then the environment is complex, and
array references in this segment are controlled by a pseudo-iteration
stack. In the statement A←B+C[V;W], A and B will have simple
environments, but C will be complex, as the reference to C is embedded in a
segment resulting from the expansion of the subscript operator. Note
that this concept is recursive. For example, we can also say that the
environment of the subexpression C[V;W] is simple. This recursiveness
allows arbitrary levels of subscript nesting to be handled by the
drag-along scheme of the D-machine.

The segment containing the IFA @C instruction is activated in the
EM by an SG instruction referring to a sequence of entries in QS of the
form:

XT XCODE(a, m1, c1)
XT XCODE(b, m2, c2)
Here, a and b are indices for C calculated from the subscripts V and W
by the expanded subscript code in QS. These quantities are, in turn,
computed from the current values in IS. m1 and m2 are the maximum
permissible values of a and b derived from ρC, and c1 and c2 are change
flags. Thus, these XT entries correspond to the CNT, MAX, and CH
fields of the iteration stack, and are therefore called a pseudo-iteration
stack (pseudo-IS).

b. Initialization of Access Instructions

Each array accessing instruction must be bound to its indexing environment
when first executed. This process is described below for IFA instructions and
is analogous for IA and IJ.
(i) Determine index sources

The encoded access mask in the AUX field of an instruction is used
to determine its indexing environment. For example, if the environment
is simple and the bit pattern in AUX is 0101 and the IS is four deep, then
the index sources are determined by (0,1,0,1)/0,1,2,3 which is the vector
1,3. Call this vector INX. Had the QP field of LS indicated a complex
indexing environment, then INX would have been based on the length of the
pseudo-IS rather than on the length of IS.
(ii) Set up iteration control block

An iteration control block (ICB) is established at the top of QS,
containing the coefficients of the storage mapping function from the DA
for the array (DEL) and the INX vector calculated above. An ICB contains
one word for each coordinate of the array being accessed, as shown below.
The fields marked Q1 and Q2 are both encoded into the VALUE field of
QS using the function QCODE (see Appendix A). The contents of the I-th
ICB entry are:

field   contents

OP      if simple environment then NT else QT
LINK    INX[I]
AUX     0
Q2      DEL[I]
Q1      if simple environment then DEL[I] × (MAX field of IS
        entry selected by LINK field) else 0
In addition, the last entry in an ICB is given opcode NLT or QLT, depending
on its environment.
(iii) Initialize QS entry

The Q1 fields of the ICB just established are added to the ABASE
found in the array's descriptor array to produce the sum S. VBASE is
also fetched from the DA, and the DA is "erased" from QS by subtracting
1 from its reference count. The original IFA entry is then replaced by

FA QCODE(VBASE, S) IPTR
where IPTR is a pointer to the beginning of the ICB for this array.
This completes the initialization of array references. In effect, what has
been done is to replace the context-independent reference created by the D-machine
by information which binds the reference to its indexing environment, and which
contains all information necessary to access the array (in the ICB).
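The three initialization steps can be sketched in Python for the simple environment. This is a minimal sketch under stated assumptions, not the E-machine's encoding: ICB entries are modelled as dictionaries instead of QCODE-packed words, the function names are hypothetical, and the complex environment is omitted.

```python
def index_sources(mask, depth):
    """Step (i): INX is the compression mask / 0,1,...,depth-1."""
    assert len(mask) == depth
    return [pos for pos, bit in enumerate(mask) if bit == "1"]


def build_icb(dels, inx, is_max):
    """Step (ii): one ICB entry per coordinate (simple environment only)."""
    icb = []
    for i, (delta, link) in enumerate(zip(dels, inx)):
        last = i == len(dels) - 1
        icb.append({
            "OP": "NLT" if last else "NT",   # last entry is marked NLT
            "LINK": link,                    # IS entry supplying this index
            "Q2": delta,                     # DEL[I]
            "Q1": delta * is_max[link],      # DEL[I] x MAX of that IS entry
        })
    return icb


def initial_s(abase, icb):
    """Step (iii): S is ABASE plus the Q1 fields of the ICB."""
    return abase + sum(entry["Q1"] for entry in icb)


# A matrix with DEL = 2,1 and mask 0101 under a four-deep iteration stack
# whose MAX fields are 0,3,0,1.
inx = index_sources("0101", 4)
icb = build_icb([2, 1], inx, is_max=[0, 3, 0, 1])
s = initial_s(0, icb)
```

With these illustrative values INX is 1,3, the Q1 fields are 6 and 1, and S is 7.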
c. The Index Unit

The index unit (IU) is invoked by the E-machine every time it executes an
array-access instruction that has been initialized as above (i.e., FA, A, J).
Using the information in the instruction, its ICB, and IS or a pseudo-IS, the IU
accesses the appropriate array element and pushes it to VS. The IU functions
differently, depending on the indexing environment:
(i) Simple environment

In this case, we know a priori that the elements of the array will
be accessed in a simple order, determined by the way IS is cycled, and this
information can be used to minimize the re-computation of the storage
mapping function for each element of the array. The IU looks at the
iteration stack entries for this array (specified in the ICB), starting at
the right-most coordinate. If the IS entry has changed (noted by the CH bit)
but not recycled, then the IU adds the DEL component from the ICB to S;
if there was a change and a recycle, the Q1 field is subtracted from S.
The new S value is stored back in the instruction. This process continues
until an IS entry with no changes is found, in which case none of the
higher IS entries contain changes either. If the iteration is going backwards,
as in a reduce, then addition and subtraction are interchanged.
(ii) Complex environment

In the complex case, there is no way of predicting in advance how the
indices will proceed, and each change requires an explicit evaluation of
part of the mapping function. This is done similarly to the simple case,
by examining the pseudo-IS for each coordinate of the array. If a change
is recorded (in the X3 part of the XT entry), then the new index (X1 part) is
multiplied by DEL. This result is added to S and the Q1 field of the ICB is
subtracted from S, with the new S stored back in QS. Finally, the product
just found is stored in the Q1 part of the ICB. This field thus records
partial values of the mapping polynomial.
The behavior of the machine in array accessing, as described above, is
illustrated in Example 6.
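Both update rules can be checked against the full storage mapping function with a short Python sketch. It is a simplified model under stated assumptions: the iteration stack is stepped inside the same loop as the index unit, iteration is forward only, S starts at ABASE for the first element, and the Q1 fields of the complex case start at zero. The function names are hypothetical.

```python
from itertools import product
from math import prod


def simple_walk(abase, dims, dels):
    """Simple environment: addresses of all elements in iteration order,
    obtained by adding DEL on a change and subtracting Q1 on a recycle."""
    maxes = [d - 1 for d in dims]
    q1 = [dl * mx for dl, mx in zip(dels, maxes)]
    cnt = [0] * len(dims)
    s = abase
    addresses = []
    for step in range(prod(dims)):
        if step > 0:
            # Examine coordinates starting from the right-most one.
            for c in reversed(range(len(dims))):
                if cnt[c] < maxes[c]:
                    cnt[c] += 1
                    s += dels[c]      # changed, not recycled
                    break
                cnt[c] = 0
                s -= q1[c]            # changed and recycled
        addresses.append(s)
    return addresses


def complex_update(s, q1, dels, coord, new_index):
    """Complex environment: one coordinate has a new, unpredictable index."""
    partial = new_index * dels[coord]
    s = s + partial - q1[coord]       # replace the old partial product
    q1[coord] = partial               # Q1 records the new partial value
    return s


def direct(abase, dels, index):
    """Reference: the storage mapping function evaluated in full."""
    return abase + sum(i * d for i, d in zip(index, dels))
```

In the simple case no multiplication is needed per element, while the complex case costs one multiplication per changed coordinate; in both, S always equals the value the full mapping function would give.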
2. Instruction Set

Instructions in the E-machine can be considered in three groups:
a. Simple instructions
b. Control instructions
c. Micro-instructions, used primarily for maintaining pseudo-iteration stacks.
In addition, as seen in the previous section, the instruction buffer contains entries
for pseudo-iteration stacks (opcode XT) and iteration control blocks (NT, QT, NLT, QLT).
Table 3 summarizes the E-machine repertoire, and Appendix B contains a detailed
algorithmic description of the E-machine's behavior. The remainder of this section
discusses these instructions in both functional and "programming" terms.
a. Simple instructions

The S instruction, Load Scalar, pushes its value to VS with tag ST. IFA
fetches an array element according to its operand DA and the indexing environment,
and pushes it to VS with tag ST; similarly, IJ pushes an element of a J-vector to
VS, while IA pushes an address of an array element (tag AT). These instructions
can be considered simply at the programming level, as just described, although
the mechanism which they invoke is much more complex, as was seen in the previous
section.

The instructions OP and GOP have as operands the names of arithmetic
functions in the EM (monadic or dyadic). Executing an OP or GOP invokes the
named function, which operates on the top of VS, deleting the operands and pushing
the result, with tag ST. (This process is illustrated in Example 1.) NIL is a
no-op, and does nothing. Recall from Section D and Appendix C that IRD and IRP
are generated by the D-machine to keep track of intermediate results in doing
drag-along. As they have no use in the E-machine, they are changed to NIL when
first executed.

b. Control instructions

The main control instructions are SGV and SG, whose operands are QS
segment descriptors. SGV pushes this descriptor to VS (with tag SGT) and is thus
analogous to LDSEG in the DM. SG activates the named segment by pushing an
entry to LS; in this instruction, the LINK field is significant, in that it can change
the indexing environment. JMP, J0, J1, JN0, and JN1 are simply relative jumps
within QS; RED is also a relative jump, but in addition, it pushes to VS an entry
with tag RT, to be used as an accumulator for a reduction. (RED is generated by
the DM only in conjunction with reductions.)
MIT is used primarily to activate reduction segments. It takes ST entries
from the top of VS and uses them to push new iterations to IS. When the MIT
execution reaches an SGT entry on the top of VS, the referenced segment is activated
by pushing the descriptor information to LS. (See Appendix C for a description
of how reduction segments are deferred in QS.)
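The behavior of MIT can be sketched in Python, using the field initializations listed for it in Table 3-B. This is a hedged model only: stacks are Python lists with the top at the end, entries are tuples or dictionaries rather than machine words, and the function name `mit` is chosen here for illustration.

```python
def mit(vs, is_stack, ls):
    """Mark and Iterate: turn ST entries on VS into new IS entries, then
    activate the segment named by the first SGT entry found."""
    first = True
    while vs:
        tag, value = vs.pop()
        if tag == "SGT":
            ls.append(value)              # activate the named segment
            return
        if tag != "ST":
            raise ValueError("MIT expects ST or SGT entries on VS")
        maximum = abs(value) - 1          # MAX is |value| less 1
        direction = 0 if value > 0 else 1 # forward if positive, else backward
        is_stack.append({
            "CNT": 0 if direction == 0 else maximum,
            "MAX": maximum,
            "DIR": direction,
            "MRK": 1 if first else 0,     # only the first new entry is marked
        })
        first = False
    raise ValueError("no segment descriptor found on VS")


# The sequence SGV A; S -2; MIT leaves an SGT entry under a scalar -2.
vs, is_stack, ls = [("SGT", "A"), ("ST", -2)], [], []
mit(vs, is_stack, ls)
```

With a scalar of -2 the new IS entry counts backward from 1 to 0, the pattern needed for a reduction over a coordinate of length 2.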

c. Micro-instructions

The micro-instructions are used by the E-machine to maintain
pseudo-iteration stacks in QS. They result from D-machine expansions of subscripting
and related operations. The micro-instructions are fully explained in Table 3-C,
and the DM expansions in Appendix C illustrate their use.



TABLE 3
E-Machine Instruction Set

Notes:

a. Each instruction is in the form

OP VALUE LINK AUX
In the discussion, K is the address of the instruction in QS.

b. Instructions starting with the letter "I" are "uninitialized." That is, they
have not yet been bound to their indexing environments. They are changed to
similar instructions without the leading "I" when first executed.
TABLE 3-A
E-Machine - Simple Instructions

Operation   Name                  Definition

S           Load Scalar           Push VALUE to VS, with tag ST.

IFA         Load Array            IFA causes initialization, as described in
FA          Element               Section E.1.b, and the instruction becomes
                                  FA. FA fetches an array element determined
                                  by the indexing environment and pushes the
                                  value to VS with tag ST.

IA          Load Array            IA causes initialization and the instruction
A           Address               becomes A. A is similar to FA except that
                                  the (encoded) address of the selected element
                                  is pushed to VS with tag AT.

IJ          Load J-Vector         IJ is similar to IFA, and becomes J after
J           Element               initialization. The VALUE field is an encoded
                                  descriptor of a J-vector, the correct element
                                  of which is computed and pushed to VS with
                                  tag ST.

OP          Scalar                The VALUE field is the name of a scalar
GOP         Operator              arithmetic operator. This is invoked and
                                  takes its operands from the top of VS, leaving
                                  a result there after deleting the operands.

NIL         No Operation          No operation.

IRD         Result                These instructions are used by the D-machine
IRP         Dimension             and are left in QS when a segment is turned
                                  over to the E-machine. Since they are of no
                                  use to the EM, they are changed to NIL the
                                  first time encountered.
TABLE 3-B
E-Machine - Control Instructions

Operation   Name                  Definition

SGV         Load Segment          The VALUE field is a QS segment descriptor, with
            Descriptor            addresses relative to K. Make these addresses
                                  absolute and push the descriptor to VS with tag SGT.

SG          Activate              The VALUE field is as in SGV, and LINK, if
            Segment               non-zero, points to a pseudo-iteration stack in QS.
                                  Activate the segment by pushing an entry to LS,
                                  using the LINK information to alter the QP field of
                                  LS if necessary.

JMP         Jump                  Potential jump destination is K+LINK, where LINK
J0          Jump if 0             is considered as a signed number. JMP is
J1          Jump if 1             unconditional. The others are conditional on the
JN0         Jump if 0,            value on top of VS. J0 and J1 also pop VS.
            nondestructive
JN1         Jump if 1,
            nondestructive

RED         Begin                 Push an element with tag RT to VS to act as a
            Reduction             reduction accumulator, and jump to K+LINK.

MIT         Mark and              Scalar values on top of VS are used to start a new
            Iterate               iteration nest in IS. The absolute value of the VS
                                  value, less 1, is the MAX field in IS; the iteration
                                  direction (DIR) is forward (0) if VS is positive,
                                  otherwise backward (1). The CNT field of IS is
                                  initialized to 0 or MAX, depending on whether DIR
                                  is 0 or 1. Moreover, the first entry in IS has its
                                  MRK bit set to 1; all others are 0. Each VS value
                                  is popped. Finally, when an SGT entry is found it is
                                  popped and the named segment is activated in LS.
TABLE 3-C
E-Machine - Micro-Instructions

Operation   Name                  Definition

POP         Pop                   Pop top element of VS.

DUP         Duplicate             Fetch the VS entry, LINK elements from top of VS, and
                                  push it to VS. (Does not disturb original copy.)

ORG         Load IORG             Push current value of IORG register to VS (tag ST).

CY          Cycle                 Step IS and repeat the current segment if IS hasn't
                                  overflowed.

LVE         Leave                 De-activate the current segment, erasing any associated
                                  IS entries.

RPT         Repeat                Repeat current segment from beginning. (Does not
                                  affect IS.)

CAS         Case                  If top of VS is not an integer scalar, then error; else
                                  if the value is N, then pop VS, execute the instruction
                                  at K+N, and resume execution at K+LINK.

VXC         Exchange              Interchange top two entries on VS.

LX1         Load from             LINK fields are relative pointers to XT entries. Push X1
LX2         Pseudo-IS             (or X2) field of referenced entry to VS, tag ST.

SX1         Store in              Store top (ST) entry on VS in X1 (or X2) field of
SX2         Pseudo-IS             referenced XT entry. Pop VS.

IXL         Index Load            IXL is initialized to give XL, in which the LINK field
XL                                points to IS or a pseudo-IS element. XL gets the current
                                  iteration value, adds IORG, and pushes the result to VS
                                  with tag ST.

XS          Index Store           Subtract IORG from ST entry on top of VS; store in X1
                                  field of XT entry at K-LINK in QS; if the value just
                                  stored is negative or greater than the X2 field of the
                                  same word, signal an error. Set the X3 field (change
                                  bit) to 1, and pop VS.

XC          Index Change          Set the change bit (X3 field) of the referenced XT
                                  entry to 1.

ISC         Activate              ISC is initialized to SC in the same way as IXL. The
SC          Segment               VALUE field of the instruction is a QS segment
            Conditional           descriptor. If the change bit in the referenced IS or
                                  pseudo-IS entry is 1, then the segment is activated.
                                  Otherwise, the change bit of the XT entry referenced by
                                  the following instruction is set to 0, and this
                                  instruction is skipped.
EXAMPLE 6:

This example illustrates typical behavior of the E-machine. Consider the
APL statement

E[I;]←EP>|¯1+(+/(1 2 2⍉PT∘.-PT[I;])*2)*0.5
and suppose it is encountered by the machine when the variables are as below:
EP is 0.0001



I is 2

PT is a matrix with ρPT ↔ 4,2

E is a matrix with ρE ↔ 4,4


The D-machine code for this statement is as follows:

D-Machine Code for Statement in Example 6:

Addr   Op      Operand         Comments

200    LDS     0.5
202    LDS     2
204    LDSEG   SCODE(0,0,0)    Empty subscript
206    LDNF    I
208    LDNF    PT
210    SUBS    2               Result is PT[I;]
212    LDNF    PT
214    GDF     SUB             PT∘.-PT[I;]
216    LDCON   50              Constant vector 1,2,2
218    TRANS                   1 2 2⍉PT∘.-PT[I;]
219    PWR                     (1 2 2⍉PT∘.-PT[I;])*2
220    RED     1 ADD           +/(1 2 2⍉PT∘.-PT[I;])*2
223    PWR                     (+/(1 2 2⍉PT∘.-PT[I;])*2)*0.5
224    LDS     -1
226    ADD                     ¯1+R
227    MOD                     |¯1+R
228    LDNF    EP
230    GT                      EP>|¯1+R
231    LDSEG   SCODE(0,0,0)    Empty subscript
233    LDNF    I
235    LDN     E
237    SUBS    2               E[I;]
239    ASGN                    E[I;]←EP>|¯1+R
240    ...
250    RC=1    LEN=4           Header for DA of constant 1,2,2
251    VB=0    AB=54           Rest of DA
252    RANK=1
253    R(1)=3  D(1)=1
254    RC=1    LEN=4           Header for value of constant 1,2,2
255    1
256    2                       Value array
257    2
Example 6-1 shows the instruction buffer containing the deferred code to
evaluate the sample statement. The transpose operation was evaluated in the
D-machine using beating, and its results are manifested in the access masks (AUX
field) in the instructions at locations 3 and 4.

Four temporary descriptor arrays were created by the DM as follows:
@T1 DA for PT[2;]. (Recall that I is 2 in this example.)
@T2 DA containing dimension of the result of the GOP operation,
in this case 4,2.
@T3 DA containing dimension of the reduction result, in this case 4.
@T4 DA for E[2;]

The deferred code is equivalent to the following:
for J := 0 step 1 until 3 do
begin
REDUCE := 0;
for K := 1 step -1 until 0 do
REDUCE := REDUCE + (PT[J;K]-PT[2;K])*2;
E[2;J] := 0.0001>|¯1+(REDUCE*0.5);
end
The remainder of the example shows the E-machine's progress through the code
in QS, and contains comments which explain the machine's actions at each step.
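The deferred code can be mirrored in Python to show what the E-machine computes for this statement. The sketch is illustrative only: the function name `update_row` is hypothetical, indices are 0-origin, and the four points used for PT are assumed sample data rather than values taken from the memory dumps.

```python
def update_row(e, pt, i, ep):
    """E[I;] <- EP > | -1 + (sum over K of (PT[J;K] - PT[I;K])^2) ^ 0.5,
    evaluated one element of row I at a time, with no temporary arrays."""
    for j in range(len(pt)):
        reduce_acc = 0.0
        # The reduction runs backward over K, as in the deferred code.
        for k in reversed(range(len(pt[0]))):
            reduce_acc += (pt[j][k] - pt[i][k]) ** 2
        e[i][j] = 1 if ep > abs(-1 + reduce_acc ** 0.5) else 0
    return e


# Assumed sample data: four points on the unit square, E initially zero.
pt = [[0, 0], [0, 1], [1, 0], [1, 1]]
e = [[0] * 4 for _ in range(4)]
update_row(e, pt, i=2, ep=0.0001)
```

Row 2 of E ends up marking the points at unit distance from PT[2;], here the first and last points. Neither the 4-by-2-by-2 outer product nor its transpose is ever materialized, which is the saving that drag-along and beating are meant to provide.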
REGISTER DUMP
NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 00

       REL   ORG   LEN   D/E   IS   FN   NWT   QP          CTR   MAX   DIR   CH   MRK
LS:    040   000   075              1    3     00    IS:   000   003   0     0    1
       000   000   022                   3     00    -->
-->
EFFECTIVE ADDR = 0000

       TAG   VALUE
VS:    FMT   FCODE(-1,0,F)
-->

       QP    OP    VALUE            LINK   AUX
QS:    00    S     0.5
       01    RED                    08
       02    S     2
       03    IFA   @T1                     0001
       04    IFA   @PT                     0011
       05    GOP   SUB              02     0011
       06    IRD   @T2                     0011
       07    OP    PWR              05     0011
       08    OP    ADD              07     0011
       09    SGV   SCODE(SEG.A,1)
       10    S     -2
       11    MIT
       12    IRD   @T3                     0001
       13    OP    PWR              11     0001
       14    S     -1
       15    OP    ADD              02     0001
       16    OP    MOD                     0001
       17    S     0.0001
       18    OP    GT               02     0001
       19    IA    @T4                     0001
       20    OP    ASGN             02     0001
       21    POP

THE D-MACHINE HAS JUST PASSED CONTROL TO THE E-MACHINE. NO EXECUTION
HAS TAKEN PLACE YET. THE FUNCTION MARK ON VS WAS PLACED THERE BY
ACTIVATING FUNCTION F. THE CONTENTS OF THE MARK ARE THE PREVIOUS
VALUES OF FREG (-1) AND IORG (0), AND THE NAME OF THE FUNCTION (F).

SEGMENT A WITHIN QS EVALUATES THE REDUCTION FOUND IN THE SOURCE
CODE. THE ITERATION STACK IS SET UP TO DO THE EQUIVALENT OF THE
"FOR J := 0 STEP 1 UNTIL 3" ITERATION.



EXAMPLE 6 — E-MACHINE 
MEMORY DUMP 



AUOR CONTENTS 



VPT 


RC = 2 


♦01 





♦ 02 





♦ 33 





♦C<t 


1 


♦0 5 


1 


♦ 06 





♦0 7 


I 


♦ 08 


I 


VE 


RC = 2 


♦ 01 





♦0 2 


1 


♦ 03 


I 


♦ OA 





♦05 


I 


♦ C6 





♦ C7 





♦06 


I 


♦ 09 





♦ 10 





♦ 11 





♦ 12 





♦ 13 





♦14 





♦15 





♦ 16 


c 



AOCR CCNTtNTS 



NT: TAG CONTENTS 



iPT 


RC»2 


LEN»05 




FT 


•IF 


♦ 01 


Vb=VPT 


Ab'OOO 




ST 


2 


♦ 02 


RANK» 


2 


PT 


DT 


SPT 


♦ 03 


R(l)»004 


OiU-02 




OT 


aE 


♦ 04 


K(2)»00? 


D(2)*0l 


EP 


ST 


O.OOOl 


dE 


RC«l 


LEN«05 








♦ 01 


V6=Vfc 


Ab=OO0 








♦ 02 


RANK* 


Z 








♦ 03 


Rin=004 


oin=04 








♦ 04 


R{2)=004 


Dl7)»0l 








olTl 


RC=l 


LEN»04 








♦ 01 


va«vPT 


AB»004 








♦ 02 


RANK« 


I 








♦ 03 


K(n=oo2 


0( U>01 








(ST 2 


RC»1 


LEN=05 








♦ 01 


VB= 


A8=000 








♦ 02 


RANK* 


2 








♦ 03 


Rtll=004 


0( 11*02 








♦04 


Rt2M002 


0(2)=0l 








«IT3 


RC*l 


LEN=04 








♦01 


VB= 


AB=000 








♦02 


RANK^s 


I 








♦ 03 


kU)*004 


0( l)»01 









ST4 RC*l LEN=04 

♦01 va=V£ AB=008 
♦02 RANK»1 

♦ 03 R<U*C04 D<1»»01 

NOTE THAT IN THE NAMETABLE* THE ENTRY FOR THE IDENTIFIER F POINTS 

TC AF. THE TAG OF THE ENTRY IDENTIFIES IT AS A FUNCTION NAME. 

aF IS THE ADDRESS OF THE FUNCTION DESCRIPTOR FOR F, WHICH IS NOT SHOWN, 

EXAMPLE 6-2: STATE OF MEMCRV BEFORE EXECUTION 



EXAMPLE 6-1: STATE OF THE REGISTERS BEFORE EXECUTION 



EXAMPLE 6 — E-M4CHINE 



PEGISTEft DUMP 

KEWIT « I lORG ■ FREG » OOCOO F8ASE « 00200 ISHK « CO 

R£L ORG LEN 0/E IS FN NhT QP CTR MAX OIR CH NRK 

LSt * — ♦ ♦-- ♦ ♦ ♦ ♦ ♦ ♦ IS J ♦ ♦ ♦— ♦ ♦ ♦ 

t 040 I 000 I 075 I i I I I 3 I 00 I | 000 | C03 i I 1 I 1 1 

I 001 I 000 t 022 111 I I 1 3 I 00 I — > t 
— > I 



EXAMPiE 6 — E'MACHINE 



REGISTER DUMP 
KEWIT » 1 lORG 



FREG ' 00000 



FBASE * 0020 C ISMK « 01 



REL ORG LEN U/E IS FN NtiT OP 



I 040 I COO I 015 
I 012 I 000 I 022 

I 000 I 002 } 00 7 



U t 3 I 1 I 3 I 00 I 
II I I I 3 I 00 I 
I I I I I I I 00 I —> 



MAX OIR CH MRX 



I 000 I 003 I 1 1 I 1 I 
I 001 I 001 I I t 1 I t I 



CO 

o 

I 



EFFECTIVE ADOR * 0001 IN QS 

TAG VALUE OP VALUE LINK AUX 

VS:* ♦ ♦ QSi* ♦ * ♦ 

I FMT I FCOOEt-ltOfFI | ***0S UNCHANGED*** 

I ST I 0.5 I 

— >l 

THE S INSTRUCTICN ILUAO SCALARI PUSHED ITS UPERAnC 10. 5» 1Q VS. 
EXAMPLE 6-3: AFTER S 



REGISTER DUMP 

NEWIT * I lOKG * 



FREG « OCOOO 
LEN D/E IS FN NWT QP 



FBASE « 00200 ISMK - 00 

CTR MAX OIR CH MRK 
I 1 I I I 



REL ORG 
LS: ♦ ♦ ♦ ♦ ♦ •• •■ ♦ ♦ IS: ♦ ♦ 

f 040 I 000 I 075 i I I I I 3 I 00 I I 000 I 003 
I Oil I 000 I 322 I 1 I 1 I I 3 I 00 I —> I 
"> I 

EFFECTIVE AODR * oOll IN gS 



TAG 
1 FMT 


VALUE 


OP 


VALUE LINK AUX 


1 FCOOE(-1»0»F) 




♦**QS UNCHANGED*** 


1 ST 


\ 0.5 






1 RT 


1 






J SGT 


1 SCOOEtSEG.AtU 






1 ST 


1 -2 






— >1 









THb RED OPERATOR POSHED THE RT ENTRV, TO 6t USED AS AN ACCUMULATOR 

FOK THE RECUCTIGNf AND JUMPED TO QS(9;I. THE SGV INSTRUCTION (AT 91 

PUSHED ITS OPERAND (THE DESCRIPTOR FOR SEGMENT A» TO VS. 

THE S INSTRUCTION CAT 101 PUSHED THE -2 VALUE TC VS. 

THESE TUG ENTRIES WILL BE USED BY THE MIT INSTRUCTION TO ACTIVATE 

THE REDUCTION SEGMENT. 

EXAMPLE 6-4: AFTER REDi SGVt AND S 



LINK AUX 



•**QS UNCHANGED*** 



EFFECTIVE ADOR = 0002 IN QS 

TAG VALUE 

VSJ* ♦ ♦ QSS ♦ 

i FMT I FCODEI-UOffi I 
I ST I 0.5 I 

I RT I I 

— >l 

MIT USED THE SCALAR -2 CN TOP OF VS TO START A NEM ITERATION. 
THE LENGTH OF THE ITERATION IS 2» AND THUS THE MAX FIELD IN THE ITERATION 
STACK IS SET TO 1. THE NEGATIVE SIGN OF THE VS ENTRY SIGNIFIED THAT THE 
ITEHATION IS TO RUN BACKWARDS (DIR*!}** HENCE CTR STARTS AT 1 INSTEAD OF 0. 
THE NEXT VS ENTRY MAS A SEGMENT DESCRIPTUR FOR SEGMENT A IN QS. 
MIT USED THIS TO ACTIVATE THE SEGMENT, BY PUSHING A NEW ENTRY TO LS. 
NOTE THAT IN THE NEW LS ENTRY, THE NhT iilT IS li THIS WAS THE PREVIOUS 
VALUE OF NEWIT. NEWlT IS NOW 1 BECAUSE A NEW ITERATION HAS BEEN STARTED. 

EXAMPLE 6-5: AFTER MIT 

REGISTER DUMP 

NEWIT > I lORG « FREG « OOOOO FBASE « 00200 ISMK > 01 

REL ORG LEN 0/E IS FN NWT QP CTR MAX OIR CH MMK 

LS: ♦ ♦ ♦ ♦ — -♦- — ♦ ♦ — -♦ — — ♦ IS: ♦ ♦——-♦-—♦-—♦—♦ 

I 040 I 000 I 075 I I I I I 3 I 00 I t GOO I C03 1 I 1 I I I 

I 012 I 000 I 022 I 1 I 1 I j 3 I 00 1 ] 001 ] COl I 1 I 1 1 1 I 

I 001 I 0C2 I 007 I I I I I I I I 00 I —> I 
— > I 

EFFECTIVE ADOR = C003 IN CS 

TAG VALUE 



VALUE LINK AUX 

.................. — ^.-.. ....... 

***QS UNCHANGED*** 



VS:* ♦ ♦ QS:* 

I FMT I FCO0E(-l,0,F) I 

I ST I 0.5 I 

t RT I I 

f ST I ? I 

— >l 

THE FIRST INSTRUCTION OF THE NEWLY-ACTIVATED SEGMENT (SEG.A) IS S , 
AT QS(2il. THIS INSTRUCTION PUSHED ITS OPERAND (2) TO VS. 



EXAMPLE 6-6S AFTER S lAT QS(2;I I 
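
The way MIT derives an iteration-stack entry from a signed length, as described in the comment to Example 6-5, can be sketched as follows. This is an illustrative Python fragment, not the machine's own algorithm; the function name and the dictionary representation of an IS entry are assumptions.

```python
def start_iteration(signed_len):
    """Build the CTR, MAX and DIR fields of an IS entry from a signed length.

    A negative length requests a backward iteration (DIR = 1), in which
    case CTR starts at MAX instead of 0.
    """
    max_ctr = abs(signed_len) - 1          # an iteration of length n counts 0..n-1
    direction = 1 if signed_len < 0 else 0
    ctr = max_ctr if direction == 1 else 0
    return {"CTR": ctr, "MAX": max_ctr, "DIR": direction}

print(start_iteration(-2))   # {'CTR': 1, 'MAX': 1, 'DIR': 1}
```

With the scalar -2 used in the example this yields CTR = 1, MAX = 1, DIR = 1, matching the second IS entry shown after MIT.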



REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      001  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0003 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      RT
      ST   2

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,4)      19
      04   IFA  @PT                     0011
      05   GOP  SUB               02    0011
      06   IRD  @T2                     0011
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   IRD  @T3                     0001
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   IA   @T4                     0001
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01

LOCATION 3 IN QS, WHICH PREVIOUSLY CONTAINED AN IFA INSTRUCTION, HAS
BEEN INITIALIZED TO FA. THE VALUE FIELD NOW CONTAINS VPT, THE BASE
ADDRESS REFERENCED IN THE DA AT @T1, AND THE ABASE (=4) FROM THAT DA.
IN ADDITION, THE LINK FIELD OF QS(3;) IS NOW A RELATIVE POINTER TO
QS(22;), WHICH IS THE ITERATION CONTROL BLOCK FOR THIS ARRAY. THE SECOND
ELEMENT OF THE ICB ENTRY (I.E. THE Q2 FIELD) IS THE DEL FOR THIS ARRAY,
TAKEN FROM @T1. (SEE EXAMPLE 6-2 FOR CONTENTS OF T1.) THE FIRST ELEMENT
(Q1 FIELD) IS DEL TIMES THE MAX VALUE IN THE TOP ENTRY ON IS.

LS HAS NOT CHANGED YET BECAUSE THE NEWLY-CREATED FA INSTRUCTION HAS
NOT YET BEEN EXECUTED. THE INITIALIZATION PROCESS ALSO ERASED THE DA
STARTING AT @T1, WHICH IS NO LONGER REFERENCED ANYWHERE IN THE MACHINE.

EXAMPLE 6-7: AFTER IFA


REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      002  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0004 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      RT
      ST   2
      ST   0

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,5)      19
      04   IFA  @PT                     0011
      05   GOP  SUB               02    0011
      06   IRD  @T2                     0011
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   IRD  @T3                     0001
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   IA   @T4                     0001
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01

THE ADDRESS IN QS(3;) HAS BEEN UPDATED BY THE INDEX UNIT AND THE VALUE
IT REFERS TO HAS BEEN PUSHED TO VS. THUS THE VALUE (0) ON TOP OF VS
AT THIS POINT IS PT(2;1). (RECALL THAT THE EFFECTIVE ADDRESS OF AN
ARRAY ELEMENT REFERENCED IN AN FA INSTRUCTION IS THE SUM OF ITS CODED
PARTS, PLUS 1 (TO COMPENSATE FOR THE ARRAY HEADER WORD).)

EXAMPLE 6-8: AFTER FA
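
The address arithmetic described in the comments to Examples 6-7 and 6-8 can be checked with a small sketch. It is written in Python for illustration only; the function names are hypothetical, and it assumes that the coded offset of an element is ABASE plus the sum of each DEL times the corresponding iteration counter, which is consistent with the values in the dumps but is not stated in this form in the text.

```python
def coded_offset(abase, dels, counters):
    """Offset of an element: ABASE plus DEL times CTR for each index used."""
    return abase + sum(d * c for d, c in zip(dels, counters))

def effective_address(vb, offset):
    """M address of the element; the extra 1 skips the array header word."""
    return vb + offset + 1

def icb_entry(del_, max_ctr):
    """ICB entry as (Q1, Q2): Q1 is DEL times MAX, Q2 is DEL."""
    return (del_ * max_ctr, del_)

# @T1 describes PT[2;]: ABASE=4, D(1)=1; the K counter starts at 1.
print(coded_offset(4, [1], [1]))        # 5, as in QCODE(VPT,5)
# @PT has D(1)=2, D(2)=1; with J=0 and K=1 the element is PT(0;1).
print(coded_offset(0, [2, 1], [0, 1]))  # 1, as in QCODE(VPT,1)
print(icb_entry(1, 1))                  # (1, 1), as in QCODE(1,1)
```

The same arithmetic gives (6, 2) for the J dimension of PT, where DEL is 2 and MAX is 3.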



REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      003  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0005 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      RT
      ST   2
      ST   0
      ST   0

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,5)      19
      04   FA   QCODE(VPT,1)      19
      05   GOP  SUB               02    0011
      06   IRD  @T2                     0011
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   IRD  @T3                     0001
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   IA   @T4                     0001
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01
      23   NT   QCODE(6,2)
      24   NLT  QCODE(1,1)        01

THE IFA AT QS(4;) HAS BEEN CHANGED TO FA, AS IN EXAMPLE 6-7, AND THE
FA HAS BEEN EXECUTED, AS IN 6-8. THE TOP TWO ELEMENTS ON VS ARE NOW
PT(2;1) AND PT(0;1). ALSO NOTE THE TWO NEW ENTRIES ON THE TOP OF QS,
WHICH ARE THE ICB FOR THE FA AT QS(4;).

EXAMPLE 6-9: AFTER QS(4;) (INITIALIZATION AND EXECUTION)


REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      005  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0007 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      RT
      ST   2
      ST   0

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,5)      19
      04   FA   QCODE(VPT,1)      19
      05   GOP  SUB               02    0011
      06   NIL
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   IRD  @T3                     0001
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   IA   @T4                     0001
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01
      23   NT   QCODE(6,2)
      24   NLT  QCODE(1,1)        01

THE SUB HAS BEEN DONE. (IN THE E-MACHINE, GOP IS TREATED SAME AS OP.)
THE IRD OPERATION DECREASES THE REFCO OF ITS OPERAND BY 1 AND REPLACES
ITSELF BY NIL, THE NO-OP, BECAUSE IRD IS USED BY THE D-MACHINE BUT
NOT BY THE E-MACHINE.

EXAMPLE 6-10: AFTER SUB, IRD



REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      006  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0008 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      RT
      ST   0

QS:   ***QS UNCHANGED***

PWR (AT QS(7;)) WAS APPLIED TO THE TOP 2 ELEMENTS ON THE VALUE STACK,
0 AND 2; THESE OPERANDS WERE DELETED AND THE RESULT OF THE OPERATION
HAS BEEN PUSHED TO VS. (0 * 2 = 0)

EXAMPLE 6-11: AFTER PWR


REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      007  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      001  001   1    1   1

EFFECTIVE ADDR = 0009 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      ST   0

QS:   ***QS UNCHANGED***

THE ADD OPERATION, SEEING THAT ITS SECOND OPERAND HAS TAG RT,
GIVES AS ITS RESULT THE FIRST OPERAND, WITH TAG ST. THIS IS
ACCORDING TO THE DEFINITION OF REDUCTION.

EXAMPLE 6-12: AFTER ADD


REGISTER DUMP

NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 01

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00
      000  002  007   1    1   0   1   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1
      000  001   1    1   1

EFFECTIVE ADDR = 0002 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      ST   0

QS:   ***QS UNCHANGED***

IN THE LAST FIGURE, THE SEGMENT WAS COMPLETED, SINCE ITS RELATIVE
ADDRESS WAS THE SAME AS ITS LENGTH. HOWEVER, SINCE THE IS BIT
WAS SET FOR THAT SEGMENT, THE IS WAS STEPPED BUT DIDN'T OVERFLOW.
THUS, LS WAS REINITIALIZED TO THE BEGINNING OF THE SEGMENT, TO
BE REPEATED WITH THE NEW IS VALUES. NOTE THAT NEWIT NOW IS 0.
AT THIS POINT, THE EQUIVALENT OF THE ALGOLIC "REDUCE := REDUCE + ..."
HAS BEEN DONE FOR J=0 AND K=1.

THE SECOND PASS THROUGH THE REDUCTION SEGMENT PROCEEDS SIMILARLY
TO THE FIRST, EXCEPT THAT NO FURTHER INITIALIZATIONS NEED BE DONE.
AT THE END OF THIS ITERATION, REL=LEN IN LS AND, AS BEFORE, THE
ITERATION STACK WILL BE STEPPED. HOWEVER, THIS TIME IT OVERFLOWS,
SO BOTH LS AND IS ARE POPPED, RETURNING THE MACHINE TO THE
MAIN SEGMENT. (SEE NEXT FIGURE.)

EXAMPLE 6-13: BEGINNING OF SEGMENT WITH STEPPED IS
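
The role of the RT entry in Example 6-12 can be illustrated with a short sketch. This is Python written for illustration, with hypothetical names; it assumes only what the comment states, namely that an operation whose second operand carries the RT tag returns its first operand unchanged.

```python
RT = object()   # stands in for the "unused reduction accumulator" tag

def add(first, second):
    """ADD as used inside a reduction: an RT second operand is skipped."""
    if second is RT:
        return first
    return first + second

def reduce_backwards(values):
    """Accumulate from the last element to the first, starting from RT."""
    acc = RT
    for v in reversed(values):
        acc = add(v, acc)
    return acc

# Squared differences for J=0 in the example: K=0 gives 1, K=1 gives 0.
print(reduce_backwards([1, 0]))   # 1
```

Starting from a tagged empty accumulator rather than a numeric 0 means the same mechanism would serve reductions by operators other than addition, which may be why the machine takes this approach.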



REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 00

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      012  000  022   1    1   0   3   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1

EFFECTIVE ADDR = 0012 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   0.5
      ST   1

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,4)      19
      04   FA   QCODE(VPT,0)      19
      05   GOP  SUB               02    0011
      06   NIL
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   IRD  @T3                     0001
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   IA   @T4                     0001
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01
      23   NT   QCODE(6,2)
      24   NLT  QCODE(1,1)        01

REDUCE SEGMENT IS DONE. ITS RESULT (1) IS ON TOP OF VS.
NOTE THAT NEWIT WAS RESTORED TO 1 WHEN LS WAS POPPED.

THIS STAGE CORRESPONDS TO THE COMPLETION OF THE "FOR K" LOOP WITH J=0.

EXAMPLE 6-14: AFTER RETURN FROM REDUCTION


REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 00

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      020  000  022   1    1   0   3   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1

EFFECTIVE ADDR = 0020 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)
      ST   1
      AT   QCODE(VE,8)

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,4)      19
      04   FA   QCODE(VPT,0)      19
      05   GOP  SUB               02    0011
      06   NIL
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   NIL
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   A    QCODE(VE,8)       06
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01
      23   NT   QCODE(6,2)
      24   NLT  QCODE(1,1)        01
      25   NLT  QCODE(3,1)

QS(12;) THROUGH QS(19;) HAVE BEEN EXECUTED. NOTE THAT THE IA AT QS(19;)
WAS TRANSFORMED TO A AND THAT ITS RESULT IS THE CODED ADDRESS WITH
TAG 'AT' ON TOP OF VS.

EXAMPLE 6-15: BEFORE ASGN



REGISTER DUMP

NEWIT = 1   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 00

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      022  000  022   1    1   0   3   00

IS:   CTR  MAX  DIR  CH  MRK
      000  003   0    1   1

EFFECTIVE ADDR = 0022 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)

QS:   ***QS UNCHANGED***

AFTER ASGN AND VPOP. THE VALUE ON VS HAS BEEN STORED AT VE+9 IN MEMORY.
SINCE THE SEGMENT HAS BEEN COMPLETED, THE IS WILL BE STEPPED AND
LS WILL BE RESET TO THE BEGINNING SINCE THERE IS NO OVERFLOW.
THIS STAGE CORRESPONDS TO ONE PASS THROUGH THE "FOR J" RANGE, WITH J=0.

EXAMPLE 6-16: AT END OF MAIN SEGMENT, FIRST TIME THROUGH


MEMORY DUMP

ADDR  CONTENTS                  ADDR  CONTENTS
@PT   RC=1      LEN=05          @E    RC=1      LEN=05
 +01  VB=VPT    AB=000           +01  VB=VE     AB=000
 +02  RANK=2                     +02  RANK=2
 +03  R(1)=004  D(1)=02          +03  R(1)=004  D(1)=04
 +04  R(2)=002  D(2)=01          +04  R(2)=004  D(2)=01

ADDR  CONTENTS                  ADDR  CONTENTS
VPT   RC=1                      VE    RC=1
 +01  0                          +01  0
 +02  0                          +02  1
 +03  0                          +03  1
 +04  1                          +04  0
 +05  1                          +05  1
 +06  0                          +06  0
 +07  1                          +07  0
 +08  1                          +08  1
                                 +09  1
                                 +10  0
                                 +11  0
                                 +12  0
                                 +13  0
                                 +14  0
                                 +15  0
                                 +16  0

ENTRIES FOR @T1,...,@T4 NOW HAVE REFCOS OF 0, AND HAVE BEEN ADDED TO THE
LINKED AVAILABILITY LIST, ALTHOUGH THIS IS NOT SHOWN HERE.
THE ENTRY IN THE VALUE ARRAY FOR E, AT VE+9 IN MEMORY, HAS BEEN
CHANGED TO 1 BY THE ASGN OPERATION. THIS ENTRY IS E(2;0).

EXAMPLE 6-17: STATE OF M AFTER FIRST TIME THROUGH THE SEGMENT


REGISTER DUMP

NEWIT = 0   IORG = 0   FREG = 00000   FBASE = 00200   ISMK = 00

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00
      022  000  022   1    1   0   3   00

IS:   CTR  MAX  DIR  CH  MRK
      003  003   0    1   1

EFFECTIVE ADDR = 0022 IN QS

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)

QS:        OP   VALUE             LINK  AUX
      00   S    0.5
      01   RED                    08
      02   S    2
      03   FA   QCODE(VPT,4)      19
      04   FA   QCODE(VPT,6)      19
      05   GOP  SUB               02    0011
      06   NIL
      07   OP   PWR               05    0011
      08   OP   ADD               07    0011
      09   SGV  SCODE(SEG.A,1)
      10   S    -2
      11   MIT
      12   NIL
      13   OP   PWR               13    0001
      14   S    -1
      15   OP   ADD               02    0001
      16   OP   MOD                     0001
      17   S    0.0001
      18   OP   GT                02    0001
      19   A    QCODE(VE,11)      06
      20   OP   ASGN              02    0001
      21   POP
      22   NLT  QCODE(1,1)        01
      23   NT   QCODE(6,2)
      24   NLT  QCODE(1,1)        01
      25   NLT  QCODE(3,1)

THE MAIN SEGMENT WAS REPEATED 3 MORE TIMES IN THE SAME WAY AS SHOWN
FOR THE FIRST PASS. AT THIS POINT, 3 MORE VALUES HAVE BEEN STORED
AND THE IS ENTRY CORRESPONDING TO THIS SEGMENT HAS BEEN EXHAUSTED.
THIS POINT CORRESPONDS TO THE COMPLETION OF "FOR J".

EXAMPLE 6-18: REGISTERS AFTER NEXT THREE PASSES THROUGH SEGMENT



REGISTER DUMP

NEWIT = 3   IORG = 0   FREG = 00000   FBASE = 00200

LS:   REL  ORG  LEN  D/E  IS  FN  NWT  QP
      040  000  075   0    0   1   3   00

EFFECTIVE ADDR = 0240 IN M

VS:   TAG  VALUE
      FMT  FCODE(-1,0,F)

QS:   (EMPTY)

THE LAST FIGURE WAS THE END OF THE SEGMENT. THUS, IS WAS
STEPPED. SINCE IT OVERFLOWED, IS AND LS WERE POPPED.
DE-ACTIVATING THAT SEGMENT CHANGED CONTROL FROM THE E- TO THE D-MACHINE
AND THEREFORE QI WAS RESET TO THE BEGINNING OF THE SEGMENT
JUST COMPLETED.

EXAMPLE 6-19: REGISTERS AT COMPLETION OF E-MACHINE EVALUATION
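
The stepping and overflow behaviour referred to in Examples 6-13 and 6-19 can be sketched as below. This is a Python illustration under stated assumptions, not the STEPIS routine itself: an entry is assumed to overflow when it is stepped while already at the end of its range (MAX for a forward count, 0 for a backward one), and the function names are hypothetical.

```python
def step_is(entry):
    """Advance one IS entry in its DIR direction; return True on overflow."""
    if entry["DIR"] == 0:              # counting up from 0 to MAX
        if entry["CTR"] == entry["MAX"]:
            return True
        entry["CTR"] += 1
    else:                              # counting down from MAX to 0
        if entry["CTR"] == 0:
            return True
        entry["CTR"] -= 1
    return False

def count_passes(entry):
    """Number of times a segment controlled by this entry is executed."""
    passes = 1                         # the segment runs once before the first step
    while not step_is(entry):
        passes += 1
    return passes

print(count_passes({"CTR": 1, "MAX": 1, "DIR": 1}))   # 2 passes of the reduction
print(count_passes({"CTR": 0, "MAX": 3, "DIR": 0}))   # 4 passes of the main segment
```

These counts agree with the trace: the reduction segment runs twice for each of the four passes through the main segment.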
MEMORY DUMP

ADDR  CONTENTS                  ADDR  CONTENTS
@PT   RC=1      LEN=05          @E    RC=1      LEN=05
 +01  VB=VPT    AB=000           +01  VB=VE     AB=000
 +02  RANK=2                     +02  RANK=2
 +03  R(1)=004  D(1)=02          +03  R(1)=004  D(1)=04
 +04  R(2)=002  D(2)=01          +04  R(2)=004  D(2)=01

ADDR  CONTENTS                  ADDR  CONTENTS
VPT   RC=1      LEN=09          VE    RC=1      LEN=17
 +01  0                          +01  0
 +02  0                          +02  1
 +03  0                          +03  1
 +04  1                          +04  0
 +05  1                          +05  1
 +06  0                          +06  0
 +07  1                          +07  0
 +08  1                          +08  1
                                 +09  1
                                 +10  0
                                 +11  0
                                 +12  1
                                 +13  0
                                 +14  0
                                 +15  0
                                 +16  0

NOTICE THAT THE VALUES AT VE+9, 10, 11, 12 HAVE CHANGED FROM EXAMPLE 6-2.
THESE CORRESPOND TO E(2;), THE ENTIRE ROW OF E TO BE CALCULATED.

EXAMPLE 6-20: MEMORY AT COMPLETION OF E-MACHINE EVALUATION



APPENDIX A 
SUMMARY OF REGISTERS, ENCODINGS AND TAGS 

This appendix summarizes the uses of all machine registers and details the
fields in the various stacks. In addition, the several encodings used as parametric
functions in the design description are outlined. Because of the parametric nature
of the design, not much will be said about field sizes except to indicate the range
of the contents of a particular field or register. We assume that in any particular
incarnation of such a machine, all the fields are "big enough" to contain their
contents. In the detailed algorithms of Appendix B, the registers are construed
as arrays of scalars with some kind of encoding imposed upon the contents, if
necessary. While not completely rigorous, this approach serves to show how the
machine works without having to explicitly encode and decode all references to
registers at each step.

A. Registers

1. LS (Location Counter Stack)

Field   Column
Name    Index    Contents

REL     0        Relative location in segment. Generally points to the next
                 instruction to be fetched.

ORG     1        Segment origin. For D-machine segments, this is relative to
                 FBASE. In the E-machine, the effective address is
                 +/LS[LI-1;0,1] and in the D-machine it is
                 FBASE++/LS[LI-1;0,1].

LEN     2        Length of segment. For D-machine segments, this is in words,
                 and for the E-machine, this is the number of QS entries for
                 the segment.

D/E     3        Segment mode. This field is 0 for D-machine and 1 for
                 E-machine segments.

IS      4        Iteration mark. Has value 1 if this segment is associated
                 with an iteration in IS; otherwise it is 0.

FN      5        Function mark. Has value 1 (else 0) if this is the main
                 segment of an active function.

NWT     6        NEWIT value, stacked when a new iteration is activated.

QP      7        QS pointer. Used by index unit for expression indexed from
                 QS rather than IS. (See Section E.)
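
The effective-address rule given under ORG can be checked against the register dumps of Example 6. The sketch below is an illustrative Python rendering of the two APL expressions; the function name and the dictionary form of an LS entry are assumptions, and the numbers are taken as plain integers.

```python
def ls_effective_address(ls_top, fbase=0):
    """Effective address implied by the top LS entry.

    E-machine segments (D/E = 1): REL + ORG, an index into QS.
    D-machine segments (D/E = 0): FBASE + REL + ORG, an address in M.
    """
    offset = ls_top["REL"] + ls_top["ORG"]
    if ls_top["D/E"] == 1:
        return offset
    return fbase + offset

# Reduction segment in Example 6-6: REL=001, ORG=002, E-machine mode.
print(ls_effective_address({"REL": 1, "ORG": 2, "D/E": 1}))              # 3
# Main function segment in Example 6-19: REL=040, ORG=000, FBASE=00200.
print(ls_effective_address({"REL": 40, "ORG": 0, "D/E": 0}, fbase=200))  # 240
```

Both results agree with the EFFECTIVE ADDR lines printed in those dumps (0003 in QS and 0240 in M).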



2. IS (Iteration Control Stack)

Field   Column
Name    Index    Contents

CTR     0        Current iteration count. This value is always non-negative
                 and varies between 0 and the value in the MAX field, in the
                 direction indicated by the DIR field.

MAX     1        Maximum iteration count.

DIR     2        Direction of count (0 for positive, 1 for negative). If
                 positive, then CTR is initialized to 0; otherwise it is
                 initialized to MAX.

CH      3        Change. Used by STEPIS routine in main control cycle to mark
                 all IS entries which have changed since the last cycle.

MRK     4        Mark. Has value 1 for the outermost iteration of each nest.
                 Otherwise, it is 0. (See ISMK register, below.)



3. VS (Value Stack)

Field   Column
Name    Index    Contents

TAG     0        Tag field. Identifies kind of entry in value field.

VALUE   1        Value.



4. QS (Instruction Buffer)

Field   Column
Name    Index    Contents

OP      0        E-machine operation code. The QS contains instructions
                 deferred by the D-machine for later execution by the
                 E-machine. Occasionally this field will contain a tag, such
                 as XT, for an entry which is a temporary value for the EM
                 rather than an executable instruction.

VALUE   1        Value. Contains the value in immediate instructions and the
                 operand for others.

LINK    2        Link. This is a signed integer used to reference other
                 instructions and entries in QS. It is taken relative to the
                 QS index of the entry in which it is found. By keeping links
                 and segment origins relative in QS, all deferred code is
                 relocatable.

AUX     3        Access mask. Contains an encoding (MCODE) of the iteration
                 indices to use in accessing an array expression.
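
The claim that relative links make deferred code relocatable can be illustrated with a small sketch. It is Python written for illustration only: the entries, their signed LINK values and both function names are hypothetical, and the only property taken from the text is that a link is interpreted relative to the QS index of the entry holding it.

```python
# A hypothetical fragment of QS; LINK is a signed offset from the entry's own index.
qs = [
    {"OP": "S",   "LINK": 0},
    {"OP": "FA",  "LINK": 0},
    {"OP": "SUB", "LINK": -1},   # refers back to the FA entry
    {"OP": "PWR", "LINK": -3},   # refers back to the S entry
]

def link_target(buffer, index):
    """QS index referenced by the LINK field of the entry at `index`."""
    return index + buffer[index]["LINK"]

def relocate(buffer, shift):
    """Move the code `shift` entries further into QS without editing any link."""
    filler = [{"OP": "NIL", "LINK": 0} for _ in range(shift)]
    return filler + buffer

moved = relocate(qs, 5)
# Every link still reaches the same entry after the move.
print(all(moved[link_target(moved, i + 5)] is qs[link_target(qs, i)]
          for i in range(len(qs))))
```

Had the links been absolute QS indices, each one would have needed adjusting by the shift.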

5. NT (Nametable)

Field     Column
Name      Index    Contents

INX       0        Symbol index. Since NT is content-addressable, the value
                   of INX must be carried with each entry. These indices (or
                   names) may be assigned in any arbitrary way. There is no
                   built-in restriction on their use.

TAG       1        Tag. Same as tag field in VS.

CONTENTS  2        Value. Same as in VS.

6. M (Memory)

In the APL machine, M is considered to be a vector of length MLENGTH of words
which can be addressed between BOTM and TOPM. The particular encodings used
in M are not specified except as necessary; e.g., in instructions such as LDSEG,
the M-entry containing the operand is in SCODE encoding. Otherwise, each scalar
value is assumed to take up one machine word, as is each instruction. This is
clearly inefficient in space utilization, and it would be expected that any real
implementation would specify more reasonable and detailed encodings for various
kinds of values. Nothing in the machine design is based on the word as the primary
unit of memory in the machine, so there should be no problem in making such
modifications.
7. Other Scalar-Valued Registers

Register
Name        Contents

LI          LS index. (All stack indices point to the next available entry
            in the stack.)

II          IS index.

VI          VS index.

QI          QS index.

NI          NT index.

BOTP        POOL pointers ...
TOPP

ARRAVAIL    Pointers to beginning of availability chains for M allocation.
DAAVAIL

FREG        VS index of innermost active function mark. When a function
            is activated, the previous values of FREG and IORG are stacked
            in VS in the function mark, and restored on return.

IORG        Index origin for innermost active function.

FBASE       Function origin in M. Points to beginning of the segment
            containing the innermost active function. Upon exit from a
            function, FBASE is restored to point to the correct base from
            information in the stacked function mark.

NEWIT       Iteration tag. Set to 1 at the beginning of a new nest of
            iterations, and used by the index unit to keep indexing straight.
            NEWIT is stacked in LS and restored from there each time a new
            iteration nest is activated.

ISMK        IS index of the marked entry closest to the top of the iteration
            stack. Used by IU.



B. Encodings

The APL machine makes use of a few specific encoding functions. These are
used for encodings which could be expected to fit within a single machine word.
Although this bias is built into the design, it is inessential to the basic ideas used
in the design, and could be changed if necessary.
1. SCODE org, len, m. This is the encoding of a segment descriptor.
m is 0 or 1 depending on whether this segment is for the D-machine or the E-machine,
org is the beginning address and len is the length of the segment. The inverse
(decoding) functions are SORG, SLEN, and SMODE, respectively. In the EM, if
a segment descriptor is in QS, org is relative to its QS-index.

2. JCODE len, org, s. This is the encoding for a J-vector descriptor.
The inverse functions are JLEN, JORG, JS.

3. XCODE a, b, c. Encoding used for various purposes in the E-machine.
Generally, a and b are an index and its limit, respectively. c is always a single
bit quantity. It is conceivable that the functions SCODE, JCODE, and XCODE
might be identical in a particular implementation of the APL machine, as might
their inverses. The inverse functions for XCODE are X1, X2, and X3, respectively.

4. QCODE a, b. This encoding is used in constructing ICB's during EM
executions. Each field is potentially as large as the machine's memory and might
be signed. The decoding functions are Q1 and Q2.

5. MCODE mask. This is the encoding function which takes a logical
vector which is an access mask for an array and encodes it for storage in the AUX
field of QS. The inverse function is MX1.

6. FCODE freg, iorg, name. This is the encoding used in function marks
on VS. The inverses are F1, F2, F3.
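
The encodings above can be pictured as packing a few small fields into one machine word. The following Python sketch illustrates the idea for SCODE and its inverses. The field widths (16 bits of origin, 15 bits of length, 1 bit of mode) and the lower-case function names are assumptions chosen only for illustration; the design itself does not fix a word layout.

```python
# Hypothetical field widths; the design leaves the actual word layout open.
MODE_BITS = 1
LEN_BITS = 15
ORG_BITS = 16


def scode(org, length, mode):
    """Pack a segment descriptor (origin, length, mode) into one integer word."""
    if not 0 <= org < (1 << ORG_BITS):
        raise ValueError("org out of range")
    if not 0 <= length < (1 << LEN_BITS):
        raise ValueError("len out of range")
    if mode not in (0, 1):
        raise ValueError("mode must be 0 (D-machine) or 1 (E-machine)")
    return (org << (LEN_BITS + MODE_BITS)) | (length << MODE_BITS) | mode


def sorg(word):
    """Inverse of scode: recover the origin field."""
    return word >> (LEN_BITS + MODE_BITS)


def slen(word):
    """Inverse of scode: recover the length field."""
    return (word >> MODE_BITS) & ((1 << LEN_BITS) - 1)


def smode(word):
    """Inverse of scode: recover the mode bit."""
    return word & 1
```

Under these assumptions, JCODE and XCODE could share the same packing routine with different field names, which is the possibility the text mentions for a particular implementation.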
C. Tags

This section summarizes the tags which can be used in VS and NT entries.

Tag    VS   NT   Meaning

UT     1    1    Undefined value.

ST     1    1    Scalar value.

JT     1    1    J-vector. Such entries are moved to QS from VS almost
                 immediately.

DT     1    1    Descriptor array pointer. In VS means this is a result
                 to be assigned to, while in NT, all array values have this
                 tag. As with JT, DT entries will be deferred to QS as soon
                 as they are noticed.

FDT    1    0    Similar to DT, except the array is to be fetched. Same
                 note applies.

FT     0    1    Function descriptor pointer.

SGT    1    0    Segment descriptor.

NPT    1    0    Name pointer. This is an NT index.

FMT    1    0    Function mark.

RT     1    0    Unused (so far) reduction accumulator.

AT     1    0    Encoded M-address.




APPENDIX B
A FUNCTIONAL DESCRIPTION OF THE E-MACHINE

The functional description of the E-machine which follows is written in an
informal dialect of APL. It differs from "standard" APL only in its
sequence-controlling statements. Instead of using branches, more sophisticated,
and more easily understood, constructions are utilized. These are summarized
briefly below:
1. BEGIN ... END delimits a compound statement, as in ALGOL.
2. Likewise, conditional statements and expressions of the form
       IF condition THEN ... ELSE ...
   are as in ALGOL. However, in this description, the condition part
   evaluates to 1 or 0, corresponding to TRUE or FALSE in ALGOL.
3. The case construction,
       CASE n OF
       BEGIN
          S1
          S2
          ...
          Sk
       END
   chooses and executes the n-th statement in the sequence. This description
   has omitted some BEGIN's and END's in compound statements within the
   CASE statement and substituted typographical grouping. Although this is
   not syntactically rigorous, it renders the description more readable.
4. The REPEAT statement repeats its range indefinitely. Within a repeated
   statement, the CYCLE statement is used to resume the main (compound)
   statement from the beginning, and LEAVE aborts the innermost REPEAT.
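
For readers more familiar with current languages, the control constructions just listed map closely onto ordinary structured statements. The Python sketch below is only an analogy: REPEAT corresponds to an unbounded loop, CYCLE to `continue`, LEAVE to `break`, and CASE to selecting one of a sequence of statements. The function names, and the choice of counting the CASE statements from 1, are assumptions made for the example.

```python
def run_case(n, statements):
    """CASE n OF BEGIN S1 ... Sk END: run the n-th statement.

    Counting from 1 is an assumption of this sketch.
    """
    return statements[n - 1]()


def sum_odds_below(limit):
    """Add up the odd numbers below limit using REPEAT / CYCLE / LEAVE."""
    total = 0
    i = 0
    while True:          # REPEAT
        i += 1
        if i >= limit:
            break        # LEAVE: abort the innermost REPEAT
        if i % 2 == 0:
            continue     # CYCLE: resume the repeated statement from the top
        total += i
    return total
```

For instance, `sum_odds_below(10)` adds 1, 3, 5, 7 and 9, skipping the even values through the CYCLE path and stopping through the LEAVE path.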






⍝ THE E-MACHINE -- A FUNCTIONAL DESCRIPTION

⍝ MAIN CYCLE ROUTINE
REPEAT
BEGIN
   ⍝ THIS IS THE CONTROL ROUTINE IN FIGURE 2, HOWEVER,
   ⍝ ONLY THOSE PARTS RELATED TO THE E-MACHINE ARE SHOWN.
   IF ~CASTOG THEN
   BEGIN
      IF LS[LI-1;0]>LS[LI-1;2] THEN
      BEGIN ⍝ TOP SEGMENT ON LS HAS OVERFLOWED
         IF LS[LI-1;4]=1 THEN
         BEGIN ⍝ ITERATION MAY RECYCLE
            LS[LI-1;0] ← 0
            STEPIS
            NEWIT ← 0
            IF STEPTOG THEN CYCLE
         END
         ⍝ DEACTIVATE TOP SEGMENT AND TRY AGAIN
         LPOP
         CYCLE
      END
      K ← +/LS[LI-1;0,1]
      IF ~QS[K;0]∊IA,IFA,IJ,ISC,IXL THEN
         LS[LI-1;0] ← LS[LI-1;0]+1
   END
   CASTOG ← 0
   ⍝ IF ACTIVE SEGMENT IS FOR D-MACHINE THEN ACTIVATE DM
   IF LS[LI-1;3]=0 THEN DMACHINE ELSE
   CASE DECODE QS[K;0] OF ⍝ GOES TO LABELS BELOW
   BEGIN ⍝ DELIMITS RANGE OF CASE STATEMENT
   ⍝ 'LABELS' BELOW NAME E-MACHINE INTERPRETATION RULES

S) VPUSH ST,QSiK;ll 

lA ) D <■ QSIK ;ll 
IF A) INK -f- GINX K 

QSlK;2,0'i ■*- ei, IF QSiKiQl=IA THEN A ELSE FA 

I -*■ 5 -«- 

T ■*- IF I5[LI-1;7] = TgEg NT.NLT ELSE QT,QLT 

(pINX') REPEAT 

J"-*- GET DEL D,I f\ A = DELHI FOR THIS ARRAY 
S ■*- S+R-^IF Tlo:\=NT THEN A>^ISlINXlIl ;ll ELSE 
QPUSH TlI=~l+pINXl,TQCdDE R,A) ,INXlIl,0 
I <■ I+l 
END 

QSlKiH -^ QCODE (GETVBASE D) ,S+GETABASE D 

ERASE D 

A) lU K 

FA) VPUSH IF QSIK;01=A THEN AT, QSlK -,11 
ELSE ST, FETCH QS[.K;ll 
J) IU1 K

OP) EXECUTE QSLKill fl QSLKil'] ENCODES A SCALAR OP 

RED) VPUSH RT,0 

L5CLJ-1;0] -H K-^QSLK;2l 

DUP) IF K>VI THEN ERROR ELSE VPUSH VSZVI-K;1 

VXC) 11 VI<2 THEN ERROR ELSE 75[ 71-1 , 2 ; ]*-75C 7J- 2 , 1 ; ] 

POP) VPOP 

IJ) INK ^ GINX K 

S -^ (JORG QSLK;i:i) + IE 0=JS QSiK;ll THEN -lORG ELSE 

lORG + "1 + JLEN QSLKill 
QS\.K;1 -f- J,{XCODE 0,5.J5 D),INX,0 

IXL) C5[JC;0,2] -«- XL, GINX K 

XL) VPUSH ST, IF L5CII-1;7] = THEN ISLQSlKiU ',01 ELSE 
lORG + XI QSiQSlK',2l;l 

IRP) QSiKil -«- iVIL, 0,0,0 

IRB) ERASE QSiKiH 

QSLK%1 "r NIL, 0,0,0 

MIT) ISMK "<- II 
REPEAT 
BEGIN 

VI-^VI-1 

11 VS[.VI',Ol=SGT THIN LEAVE 

KkSK II VSLVIiOl^ST THEN ERROR 
IPUSH VSIVI; 1 1 , II=ISMK 
END 
LPUSH~0,iSORG VSlVI-lill),(SLEN VSlVI-liU) ,1 ,1 ,0 ,0 
NEWIT <- 1 

SGV) T ^ QSLKill ft RECALL THAT SEG DESCRS ARE RELATIVE 

VPUSH SGT,SCODE (K-SORG T) ,{SLEN T) ,SMODE T 

SG) LPUSHS K 

ISC) QSLKiO,2l -f SC,GINX K 

SO 2" f- ISLQSlKi2liZlsNEWITyQSLKi2l>ISMK 
IF T THEN LPUSHS K 
iZSE 2ErQSLK+l;0l€XS,XC THEN 

BEQli 

LSLLI-liOl ■*- LSLLI-liOl + 1 

S -t- K+l-QSLK+li2l 

A SET CHANGE BIT TO 

QSLSill -»- XCODE iXl QSLSill),(X2 QSLSill),0 

liU. 
JMP) 11 {QSiKiOl=JMP)yi{QSlK\Ol€JO^JWO)sVSlVI''l;l']=Q)

JO ) v(.QSlKiOl€Jl,JNl)^VSiVI'l;lli = l 

'^^ ) ZM^ LSiLI-liQl *■ K-^QSiK\2'\ 

JNO) IF QSlKloleJO^Jl TgEN VPOP 

JNl) 

CCY) T *■ K+QStKi2l 

QSiTill ^ XC0DE(1+X1 QSiTii:i),(X2 QSlTiH),! 
LSLLI'liOl ■*■ 

RPT) L5CLI-1;0] -<- 

LVE) LPOP 

CAS) II ~(7SCyi-l;0]=5T)A75C7J-l;l]€i55'[/f;2] THEl ERROR 
L5CLI-1;0] *■ K+QSlK;2l 
K "•- K+VSZVI-liU 
VPOP 
CASTOG •*■ 1 

XS) J *■ K-QSLKi2l 

I <- VSlVI-liH-IORG 
VPOP 

II (r<0)vj>Z2 QSiJill THEN ERROR 

EL,SE QSIJ%1'] ^ XCODE I.(J2 QSUiH),! 

XC) J ^ K'QSIK',2'} 

QSUill -- XCODE (XI QSlJill),(X2 QSlJ ;ll) ,1 

LXl) VPUSH ST, XI QSlK'QSLK;2lii:\ 

LX2) VPUSE ST,X2 QSlK-QSiKi2lil3 

SXl) T ir K-QSlKi2'\ 

QSlTil'^ ■»- XCODE F5C7I-1;1],(X2 QSlTill),! 

SX2) T *■ K-QSIK\21 

QSlTil'i *■ XCODE {XI QSLTill) ,VS\:VI-lil2,l 

ORG) VPUSH ST, I ORG 

END fk END CASE STATEMENT RANGE 

END n E'MACHINE INTERPRETATION RULES 
⍝ AUXILIARY FUNCTIONS FOR E-MACHINE

∇ INX ← GINX K;R
⍝ INX IS A VECTOR OF QS OR IS INDICES TO ACCESS ARRAY,
⍝ HIGHEST COORDINATE NUMBER (I.E. FASTEST VARYING) FIRST
R ← IF LS[LI-1;7]=0 THEN II ELSE QS[LS[LI-1;7];2]
INX ← ⌽((Rρ2)⊤2⊥QS[K;3])/⍳R
∇

∇ LPOP
IF LI=0 THEN ERROR ELSE LI ← LI-1
IF LS[LI;4]=1 THEN POPIS
IF LS[LI;5]=1 THEN FNRET
NEWIT ← LS[LI;6]
⍝ IF THIS CHANGES MODES THEN CLEAN OFF QS
IF LS[LI;3]>LS[LI-1;3] THEN
REPEAT
BEGIN
   IF QI=LS[LI;1] THEN LEAVE ELSE QI ← QI-1
   IF QS[QI;0]∊IFA,IA,RDT THEN ERASE QS[QI;1]
END
∇

∇ POPIS
II ← ISMK
REPEAT
BEGIN
   ISMK ← ISMK-1
   IF ISMK=¯1 THEN LEAVE ELSE IF IS[ISMK;4]=1 THEN LEAVE
END
∇

V LPUSH V 

II LI = LI MAX TgEH ERROR 

LSZLI;\7l -<- T&lV), NEWIT, IF 0*"lfy TgEN "ifF ELSg. LSZLI-l',7l 

LI ■*- LI+1 

V LPUSHS K 

II Q=SMODE QSiK;ll TgEN ERROR 

LPUSH 0,(K-SORG QSlK ', ll ) ,(^SLEN QSlK',ll) ,1 ,0 ,0 ,CORR K 

V 
∇ IU1 K;T;S;E

fl CALCULATE J -VECTOR ELEMENT IN FORM XCODEiCURR ,INCR,SN) 

T -<- LSLLI-li72 

S -<- (XI QSlKill),0 

11 T=0 THEN Pt IF THERE IS A CHANGE, USE NEW ITER VALUE 

RMll 

IF ISlQSiKi2:ii3lANEWITvQSLK;2:i>ISMK THEN 
~ S <- ISlQS\:Ki2l;0l,l 

MU. 

ELSE IF 1=X3 QSlT+QSlK',2lii:i THEN S ^ (.XI QSZT+KiH) ,1 
Li 5ClI=l THEN 
BEGIN 
~ T -^ X3 QSlKiH 

5C0] -<- IE T = THEN 5[0] ELSE -SlOl 
QSIK;13 * XCODE 5[0],{X2 eJU;!]),^ 
END 
VPUSH ST, Sio:i+X2 §5[X;1] 
V 

V lU K;IP;IQ;S;T;D 
n INDEX UNIT 

S *■ 

IQ -«- K+QSLKi2'] ft BEGINNING OF ICB FOR THIS ARRAY 
T *■ LSZLI-l',7l 
REPEAT 
MQIN 

IP ->- QS[.IQi2l+T 
IF T=0 THEN 

Ui.Q.li « THIS ARRAY INDEXED BY IS 
IF ISLIPi3^ANEWITvlP>ISMK THEN 
BEGIN 

If (rsCJP;o]=o)Ai5[jP:2]=o then 

S ^ S-Ql QSlIQiH 
Kk&E 

IF (I5[IP;0]=J5CIP;1])AI5[IP;2]=1 THEN 
S ^ S+Ql QSZlQiH 
KLSE IF I5CIP;2]=0 THEN 

~ S ^ S+Q2 QSLIO ;ll 
KL&K S ■<- S-Q2 QS£IQ;11 
END 
END 
KLSE 

BEGLK « THIS ARRAY INDEXED FROM QS 
~Tf 0=J3 QSlIP;ll THEN LEAVE ELSE 

MQ.LR 

"TT^ (Q2 QSlIQilDxXl QS[.IP',ll 
S *■ S+D-Ql QSriQil2 
QSlIQif] ^ QCODE D,Q2 QSZIQiH 

END 
END 
IF~QSlIQiO:ieILT,QLT THEN LEAVE ELSE IQ^IQ + 1 

MU. 
QSlK-.l'] <■ QCODE (Ql QSlK;i:i) ,S+Q2 QSIK;12 






7 R <- FETCH X 

n X IS A Q- CODED ADDRESS OF FORM QCODEiVBASE ,INCR) 

R -*- Mll + iQl X)+Q2 X;l 
V 

V EXECUTE CODOP 

ft CODOP IS A DYADIC OR MONADIC SCALAR OPERATOR(ENCODED) 
ft EXECUTE DECODES CODOP ON THE ELEMENTS OF VS: 
ft 

ft II ISDIADIC CODOP THEN 
ft BEGIIi 

ft VSlVI-1',11 f- VSLVI-lill (DECODE CODOP) VSlVI-liU 

ft VPOP 

ft END 
ft ELSE 

ft VSiVI-l;ll ■*- (DECODE CODOP) VSlVI-i;ll 
V 

∇ STEPIS;I;INCR
⍝ STEP THE ITERATION NEST IN IS
⍝ SET STEPTOG ← IF DONE THEN 0 ELSE 1
STEPTOG ← 0
I ← II
REPEAT
BEGIN
   I ← I-1
   IF (IS[I;0]=0)∧IS[I;2]=1 THEN
   BEGIN
      IF IS[I;4] THEN LEAVE ELSE IS[I;0,3] ← IS[I;1],1
   END
   ELSE IF (IS[I;0]=IS[I;1])∧IS[I;2]=0 THEN
   BEGIN
      IF IS[I;4] THEN LEAVE ELSE IS[I;0,3] ← 0,1
   END
   ELSE
   BEGIN
      STEPTOG ← 1
      IS[I;3,0] ← 1,IS[I;0]+IF IS[I;2]=0 THEN 1 ELSE ¯1
      LEAVE
   END
END
∇

∇ R ← CORR K
R ← IF QS[K;2]=0 THEN 0 ELSE K-QS[K;2]
∇

∇ IPUSH V;MX
⍝ V[0] IS COUNT (SIGNED); V[1] IS MARK
⍝ CASE OF COUNT=0 CANNOT OCCUR (HANDLED BY D-MACHINE)
MX ← ¯1+|V[0]
IF II=IIMAX THEN ERROR
IS[II;] ← (IF V[0]<0 THEN MX ELSE 0),MX,(V[0]<0),1,V[1]
II ← II+1
∇
APPENDIX C
EXPANSION OF D-MACHINE OPERATORS FOR E-MACHINE

This appendix shows how the D-machine expands complex primitives into
deferred sequences of E-machine instructions. It is assumed that the constraints
noted for each operator are met, and that all operands have been tested for domain,
conformability, and so forth before being submitted for expansion. This is not
an important constraint since, for example, the requirement that an operand be
beatable can always be satisfied by explicitly evaluating an unbeatable operand to
temporary space.

Before the expansion of any of the dyadic operations, the value stack and the
instruction buffer are as follows:

VS               QS
                 OP   VALUE   LINK   AUX

...              ...
SGT  ·------>  { Code for right operand }    m2
SGT  ·------>  { Code for left operand }     m1

where m1 and m2 are the access masks for the deferred expressions, found in the
AUX field of QS. In the sequel, segments in QS are delimited graphically by braces,
and pointers or Greek letters are used to avoid confusion with explicit relative
addressing.
1. GDF

The operands deferred in QS must be simple array values. The operand of
a GDF instruction is a dyadic scalar operator, OPR. Expansion produces the
following:

[Diagram: VS and QS after the GDF expansion. VS holds one SGT entry pointing
to a QS segment that contains the code for the right operand (AUX m2), the
code for the left operand (AUX m1'), and two further entries. For these two
entries the OP column reads GOP, IRD; the VALUE column reads OPR, T1; one
LINK field holds 1; and both AUX fields hold m3.]

In the above, T1 points to a DA containing the result rank and dimension for the
GDF. m1' is m2 shifted left by the rank of the right operand. m3 is the logical
or of m1' and m2 (i.e., m3 ← m1' ∨ m2). Because of the requirement that the
operands be simple array values, the segments in boxes each consist of a single
IJ or IFA instruction.
2. RED

By the time an expansion is to be done, any necessary transposes on the
reducee have been performed. The variable B has value 1 if the reducee is
beatable and is 0 otherwise. The "before" picture is:

VS               QS

SGT  ·------>  { Code for reducee }    m1

The reduce operator is OPR, giving rise to the expansion below:

[Diagram: VS and QS after the RED expansion. VS holds one SGT entry pointing
to the QS segment labelled σ1. In order, the OP column reads RED, (code for
reducee), OP, SGV, S, MIT, IRD; the VALUE column reads OPR, σ1, -len, @T1;
the AUX column holds m1; a field containing B-1 also appears.]

where len is the length of the reduction coordinate and T1 is a DA with the rank
and dimensions of the result.
3. DIOTA

The ranking operation, corresponding to dyadic ⍳, requires that the left
argument be a simple vector array value. This is because this operand is evaluated
repeatedly during the E-machine execution of the following expansion.

[Diagram: VS and QS after the DIOTA expansion. VS holds one SGT entry pointing
to the QS segment. The segment contains the code for the right operand
(AUX m2), a JMP, and the code for the left operand (AUX m1), labelled σ1.
These are followed by instructions whose OP column reads DUP, OP, JN1, POP,
LVE, OP, ORG, SGV, S, MIT, VXC, POP, IRP and whose VALUE column reads
NE, ADD, σ1, len.]

len is the length of the left operand. It should be clear from working through the
above expansion that it is simply a literal interpretation in E-machine code of the
definition of the ranking operator. It is assumed that the D-machine will have
checked for the case of an empty vector as either operand, producing the correct
result automatically. If the rank of the result is 0, that is if the right operand is
a scalar, the above expansion is executed immediately by the E-machine. The
IRP instruction is similar to IRD, except that it points to an instruction in QS
which contains dimension information instead of referring to an explicitly-created
DA.

4. EPS

Before expanding the membership operator, a check is made for the special
cases of right-operand scalar or 1-element quantity. In these cases the operation
done is A=B or A=(,B)[1], respectively. Similarly, if the left operand is scalar
then A=B is done. Otherwise, the expansion is made in QS as below:

[Diagram: VS and QS after the EPS expansion. VS holds one SGT entry pointing
to the QS segment. The segment contains the code for the right operand
(AUX m2) and the code for the left operand (AUX m1), with labels σ1 and σ2.
These are followed by instructions whose OP column reads RED, DUP, SG, OP, OP,
JNO, LVE, SGV, S, S, ..., S, MIT, VXC, POP, IRP and whose VALUE column reads
σ1, EQ, OR, σ2, len1, len2, ..., lenK.]

where len1, len2, ..., lenK is the dimension of the right operand. As in the
expansion for DIOTA, the expansion of EPS is a straightforward E-machine
translation of the definition of the membership operator.
5. SUBS

Before the SUBS expansion takes place, the subscripts have been examined
to see if they can be beaten into the subscriptee. If an expansion is needed, then
there must be some subscripts left. Before expansion, the registers contain:

VS               QS

SGT  ·------>  { Code for rightmost subscript }    mr
...              ...
SGT  ·------>  { Code for leftmost subscript }     m1
SGT  ·------>  { Code for subscriptee }            m0

The rank r of the subscriptee must be the same as the number of subscript
expressions. The rank of the result is the sum of the ranks of the subscripts
(counting empty subscripts as rank-1). Some of the SGT entries on the VS may
be empty, that is of the form SCODE(SEG, NIL, 0). After expansion, the picture
has changed to:



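[Diagram: VS and QS after the SUBS expansion. VS holds one SGT entry pointing
to the QS segment. In order, the segment contains:
      JMP
      { Code for rightmost non-empty subscript }
      ...
      { Code for leftmost non-empty subscript }
 σ1:  { Code for subscriptee }
 β:   XT   XCODE(0,l1,1)
      ...
      XT   XCODE(0,lr,1)
      Calc subs 1
      XS
      ...
      Calc subs r
      XS
      SG   σ1
      IRD  @T1
The LINK column contains β and the AUX field of the last entry holds mr.]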



where l1, l2, ..., lr is the dimension of the subscriptee, minus 1. This field of
the XT entries is used for checking purposes in the IU (see Section E). β is the
QS index of the beginning of the XT bank and @T1 is a DA with the rank and
dimensions of the result. mr is the access mask of the result. The link field of
β contains r, the rank of the subscriptee, which is used in the initialization of IA,
IFA, IJ instructions. The lines in QS marked "Calc subs k" are one of the
following:

OP     VALUE             LINK   AUX

ISC    SCODE(SEG K,1)           m'
IXL    σ                        m'

In the first case, the k-th subscript is to be computed explicitly, which is done by
activating SEG K, one of the non-empty subscript segments on QS. In the second
case, the segment that was stacked on VS for this subscript was empty, so the
actual subscript used is the same as that which was controlling this coordinate
from the outside. The mask m' in the AUX field specifies the index environment.
Example 4 in this chapter shows a specific instance of an expansion caused by the
SUBS operator.

The remaining operator expansions are similar to SUBS, in that they are all
special cases of it.
6. CMPRS

The compressor (left operand) has been evaluated to a temporary space, if
it was not there already, and checked to see if it contains only 0 and 1 elements.
In addition, the number of 1's, call it DIM1, has been counted and V⍳1, the index
in V of the first non-0 value, is known; call it XA. This process is unfortunately
necessary since we must know the rank and dimension of the result before deferral.
The same process must be applied to the expansion operator. Unless the
compressor falls into a special case which can be done immediately (i.e., scalar 1
or 0 or vector of all 1's or all 0's) then the following expansion is made:
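[Diagram: VS and QS after the CMPRS expansion. VS holds one SGT entry pointing
to the QS segment. The segment begins with JMP δ, followed by the code for the
compressee (σ1), the code for the compressor (σ2), and the XT entries
      XT   XCODE(0,XA,0)
 β:   XT   XCODE(0,l1,1)
      ...
 γ:   XT   XCODE(0,lk,1)
      ...
      XT   XCODE(0,lr,1)
The segment σ3 then computes the compressed subscript. Its OP column reads
IXL, JNO, LX1, OP, OP, JNO, DUP, LX1, OP, XS, SG, JO, LX1, OP, SX1, RPT, DUP,
SX1, LX2, XS, POP, LVE, with VALUE fields SUB, SGN, SUB, 2, SUB. From δ on,
the OP column reads IXL, XS, ..., XC, ..., IXL, XS, SG, IRD, with VALUE fields
σ3, σ1, @T1. The AUX column holds m1', ..., mk', ..., mr', and mr on the last
entry.]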
where l1, ..., lr are as in the SUBS expansion; m1' through mr' are the masks for
the individual subscripts with mk' being the mask for the compressed coordinate.
The first XT entry is used to hold XA and XL where XL is the last value of the
external index for the compressed coordinate. The algorithm used is as follows:
Algorithm for compression: We wish to find XT such that

     (U/[k]X)[...;I;...] ←→ X[...;XT;...]

Let XL be the last value of I from which the last XT was calculated. XA is the
index of the first 1 in U. Then, the QS expansion for compression calculates the
new value of XT as a function of the new I and old XT and XL as follows:

     if I=0 then
     begin
          XL ← 0
          XT ← XA
     end
     else
     repeat
     begin
          T ← ×XL-I
          if T=0 then leave
          repeat
          begin
               XT ← XT-T
               if U[XT]=1 then leave
          end
          XL ← XL-T
     end
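
The compression algorithm can be traced with a small Python sketch. It keeps XL and XT as state and, given a new index I into the compressed coordinate, walks XT forwards or backwards through U until XL equals I. The class name `CompressedIndex`, the method name `locate`, and the use of 0-origin indexing are assumptions of this illustration, and the caller is assumed to pass only indices smaller than the number of 1's in U.

```python
class CompressedIndex:
    """Maps index I of U/X to the matching index XT of X (0-origin sketch)."""

    def __init__(self, u):
        self.u = list(u)
        self.xa = self.u.index(1)  # XA: index of the first 1 in U
        self.xl = 0                # XL: last I for which XT was computed
        self.xt = self.xa          # XT: current index into X

    def locate(self, i):
        if i == 0:
            self.xl = 0
            self.xt = self.xa
            return self.xt
        while True:
            # T is the signum of XL - I: -1 moves forward, +1 moves backward.
            t = (self.xl > i) - (self.xl < i)
            if t == 0:
                break
            while True:
                self.xt -= t
                if self.u[self.xt] == 1:
                    break
            self.xl -= t
        return self.xt
```

With U = 0 1 1 0 1, successive calls for I = 0, 1, 2 visit X indices 1, 2 and 4, and a later call for I = 1 steps back to index 2 without rescanning U from the start.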

7. EXPND

The EXPND operator is treated similarly to CMPRS. In particular, the
expandor (left operand) is checked to see that it is a logical quantity and the number
of 1's is compared to the length of the expansion coordinate. If the expandor falls
into one of the special cases (all ones, all zeros) the result is calculated immediately.
Otherwise, the QS expansion that follows is made to implement the expansion
algorithm below:

Let R be (U\[K]X)[...;I;...]. Then we want to find LX such that R ← if U[I]=0
then 0 else X[...;LX;...]. LU is the index of the last found 1 in U and LX is the
corresponding X index (on the K-th coordinate).

     if U[I]=0 then R ← 0 else
     begin
          repeat
          begin
               T ← ×I-LU
               if T=0 then leave
               repeat
               begin
                    LU ← LU+T
                    if U[LU]=1 then leave
               end
               LX ← LX+T
          end comment main repeat;
          R ← X[...;LX;...]
     end
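
The expansion algorithm admits a similar Python sketch. Here `locate` returns `None` where U[I] is 0 (the result element is 0) and otherwise returns LX, the index into X. The names, the 0-origin indexing, and the initial state (LU at the first 1 in U, LX at 0) are assumptions of this illustration; the text does not state how LU and LX are initialized.

```python
class ExpandedIndex:
    """Maps index I of U\\X to the matching index LX of X (0-origin sketch)."""

    def __init__(self, u):
        self.u = list(u)
        self.lu = self.u.index(1)  # LU: index in U of the last found 1 (assumed start)
        self.lx = 0                # LX: corresponding index in X (assumed start)

    def locate(self, i):
        if self.u[i] == 0:
            return None            # result element is 0, X is not consulted
        while True:
            # T is the signum of I - LU: +1 moves forward, -1 moves backward.
            t = (i > self.lu) - (i < self.lu)
            if t == 0:
                break
            while True:
                self.lu += t
                if self.u[self.lu] == 1:
                    break
            self.lx += t
        return self.lx
```

With U = 1 0 1 1 0, indices 0, 2 and 3 of the result map to X indices 0, 1 and 2, while indices 1 and 4 yield zeros.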
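[Diagram: VS and QS after the EXPND expansion. VS holds one SGT entry pointing
to the QS segment. The segment begins with JMP, followed by the code for the
expandee (σ1), the code for the expandor (σ2, AUX m2), and the XT entries
 δ:   XT   XCODE(LU,lu,1)
 β:   XT   XCODE(0,l1,1)
      ...
 γ:   XT   XCODE(0,lk,1)
      ...
      XT   XCODE(0,lr,1)
The segment σ3 (AUX mk') then computes the expanded subscript. Its OP column
reads LX1, IXL, OP, OP, JNO, DUP, LX1, OP, XS, SG, JO, LX1, OP, XS, RPT, POP,
with VALUE fields SUB, SGN, ADD, σ2, ADD and links referring to δ and γ. It is
followed by IXL, XS pairs, with the pair labelled ε, and then by entries whose
OP column reads SG, SG, CAS, S, SG, IRD, with VALUE fields σ1, σ2, σ3. The AUX
column holds mk', m1', ..., mr', and mr on the last entry.]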



Note that the sequence of IXL and XS instructions starting at ε does not contain a
reference to the k-th subscript position as this has already been computed at the
beginning of the segment activated by the CAS instruction. Also, in the above, the
quantity lu in the X2 field of the pseudo-iteration stack at δ is the length of vector
U, less 1.






8. ROT 

Rotation is a special case of subscripting defined as follows; 
1£ N is a scalar, then R^N<^LKliM means for each L ELT ipM 

RLl/Ll^Mi;/iiK-l)^L) , aORG HQM)Ln\(N- IORG ) + iipM)LKl) ,ML1 
If N is an integer array 'wiiiiQN<-^(K^\QpM)/pM then 

Rl;/Ll^ML;/i{K''l)^L)A lORG HpM)in\iNL;/Ln- IORG )+i(iQM)in),KiLl 
where L^'^-^iK^iQQM)/L. 

Thus the expansion for ROT in QS is the same as for a general subscript with all
but the K-th coordinate being IXL, XS pairs and the K-th coordinate being computed
according to the above definition. The explicit expansion will be omitted since it
is similar to what has already been shown.
APPENDIX D
POWERS OF 2

2^n n 2^-n

1 0 1.0

2 1 0.5 
4 2 0.25 
8 3 0.125 

16 4 0.062 5 

32 5 0.031 25 

64 6 0.015 625 

128 7 0.007 812 5 

256 8 0.003 906 25 

512 9 0.001 953 125 

1 024 10 0.000 976 562 5 

2 048 11 0.000 488 281 25 
4 096 12 0.000 244 140 625

8 192 13 0.000 122 070 312 5 

16 384 14 0.000 061 035 156 25 

32 768 15 0.000 030 517 578 125

65 536 16 0.000 015 258 789 062 5 

131 072 17 0.000 007 629 394 531 25 

262 144 18 0.000 003 814 697 265 625 

524 288 19 0.000 001 907 348 632 812 5 

1 048 576 20 0.000 000 953 674 316 406 25 

2 097 152 21 0.000 000 476 837 158 203 125 

4 194 304 22 0.000 000 238 418 579 101 562 5

8 388 608 23 0.000 000 119 209 289 550 781 25 

16 777 216 24 0.000 000 059 604 644 775 390 625 

33 554 432 25 0.000 000 029 802 322 387 695 312 5 

67 108 864 26 0.000 000 014 901 161 193 847 656 25 

134 217 728 27 0.000 000 007 450 580 596 923 828 125

268 435 456 28 0.000 000 003 725 290 298 461 914 062 5 

536 870 912 29 0.000 000 001 862 645 149 230 957 031 25 

1 073 741 824 30 0.000 000 000 931 322 574 615 478 515 625 

2 147 483 648 31 0.000 000 000 465 661 287 307 739 257 812 5 
4 294 967 296 32 0.000 000 000 232 830 643 653 869 628 906 25 
8 589 934 592 33 0.000 000 000 116 415 321 826 934 814 453 125 

17 179 869 184 34 0.000 000 000 058 207 660 913 467 407 226 562 5 

34 359 738 368 35 0.000 000 000 029 103 830 456 733 703 613 281 25 

68 719 476 736 36 0.000 000 000 014 551 915 228 366 851 806 640 625 

137 438 953 472 37 0.000 000 000 007 275 957 614 183 425 903 320 312 5 

274 877 906 944 38 0.000 000 000 003 637 978 807 091 712 951 660 156 25 

549 755 813 888 39 0.000 000 000 001 818 989 403 545 856 475 830 078 125

1 099 511 627 776 40 0.000 000 000 000 909 494 701 772 928 237 915 039 062 5 

2 199 023 255 552 41 0.000 000 000 000 454 747 350 886 464 118 957 519 531 25 
4 398 046 511 104 42 0.000 000 000 000 227 373 675 443 232 059 478 759 765 625 

8 796 093 022 208 43 0.000 000 000 000 113 686 837 721 616 029 739 379 882 812 5 

17 592 186 044 416 44 0.000 000 000 000 056 843 418 860 808 014 869 689 941 406 25 

35 184 372 088 832 45 0.000 000 000 000 028 421 709 430 404 007 434 844 970 703 125

70 368 744 177 664 46 0.000 000 000 000 014 210 854 715 202 003 717 422 485 351 562 5 

140 737 488 355 328 47 0.000 000 000 000 007 105 427 357 601 001 858 711 242 675 781 25 

281 474 976 710 656 48 0.000 000 000 000 003 552 713 678 800 500 929 355 621 337 890 625 

562 949 953 421 312 49 0.000 000 000 000 001 776 356 839 400 250 464 677 810 668 945 312 5 

1 125 899 906 842 624 50 0.000 000 000 000 000 888 178 419 700 125 232 338 905 334 472 656 25 

2 251 799 813 685 248 51 0.000 000 000 000 000 444 089 209 850 062 616 169 452 667 236 328 125 

4 503 599 627 370 496 52 0.000 000 000 000 000 222 044 604 925 031 308 084 726 333 618 164 062 5 

9 007 199 254 740 992 53 0.000 000 000 000 000 111 022 302 462 515 654 042 363 166 809 082 031 25 

18 014 398 509 481 984 54 0.000 000 000 000 000 055 511 151 231 257 827 021 181 583 404 541 015 625 

36 028 797 018 963 968 55 0.000 000 000 000 000 027 755 575 615 628 913 510 590 791 702 270 507 812 5

72 057 594 037 927 936 56 0.000 000 000 000 000 013 877 787 807 814 456 755 295 395 851 135 253 906 25 

144 115 188 075 855 872 57 0.000 000 000 000 000 006 938 893 903 907 228 377 647 697 925 567 626 953 125

288 230 376 151 711 744 58 0.000 000 000 000 000 003 469 446 951 953 614 188 823 848 962 783 813 476 562 5 

576 460 752 303 423 488 59 0.000 000 000 000 000 001 734 723 475 976 807 094 411 924 481 391 906 738 281 25 

1 152 921 504 606 846 976 60 0.000 000 000 000 000 000 867 361 737 988 403 547 205 962 240 695 953 369 140 625 

2 305 843 009 213 693 952 61 0.000 000 000 000 000 000 433 680 868 994 201 773 602 981 120 347 976 684 570 312 5 
4 611 686 018 427 387 904 62 0.000 000 000 000 000 000 216 840 434 497 100 886 801 490 560 173 988 342 285 156 25 
9 223 372 036 854 775 808 63 0.000 000 000 000 000 000 108 420 217 248 550 443 400 745 280 086 994 171 142 578 125 

18 446 744 073 709 551 616 64 0.000 000 000 000 000 000 054 210 108 624 275 221 700 372 640 043 497 085 571 289 062 5 

36 893 488 147 419 103 232 65 0.000 000 000 000 000 000 027 105 054 312 137 610 850 186 320 021 748 542 785 644 531 25 

73 786 976 294 838 206 464 66 0.000 000 000 000 000 000 013 552 527 156 088 805 425 093 160 010 874 271 392 822 265 625 

147 573 952 589 676 412 928 67 0.000 000 000 000 000 000 006 776 263 578 034 402 712 546 580 005 437 135 696 411 132 812 5 

295 147 905 179 352 825 856 68 0.000 000 000 000 000 000 003 388 131 789 017 201 356 273 290 002 718 567 848 205 566 406 25 

590 295 810 358 705 651 712 69 0.000 000 000 000 000 000 001 694 065 894 508 600 678 136 645 001 359 283 924 102 783 203 125

1 180 591 620 717 411 303 424 70 0.000 000 000 000 000 000 000 847 032 947 254 300 339 068 322 500 679 641 962 051 391 601 562 5 

2 361 183 241 434 822 606 848 71 0.000 000 000 000 000 000 000 423 516 473 627 150 169 534 161 250 339 820 981 025 695 800 781 25
4 722 366 482 869 645 213 696 72 0.000 000 000 000 000 000 000 211 758 236 813 575 084 767 080 625 169 910 490 512 847 900 390 625 
CHAPTER V
EVALUATION

In this chapter we examine the design for an APL machine proposed in
Chapter IV and compare its performance to more conventional architectures.
This is done by showing that the APLM is more efficient in its use of memory
than a less sophisticated computer doing the same task.
A. Rationale

In Chapter III, a number of design goals for the APLM were stated:

1. Machine language should be "close" to APL.

2. Machine should be general, flexible.

3. Machine should do as much as possible automatically.

4. Machine should expend effort proportional to the complexity of its task.

5. Design should be elegant, clean, perspicuous.

6. Machine should be efficient. In particular, it should be parsimonious of
memory allocation and accessing.
We can dispose of some of these in short order. To begin with, goals 1, 3, and
4 have obviously been satisfied. Since the machine designed implements APL, to
goal 2 we can reply that the machine is general and flexible at least to the extent
that APL as a language is general and flexible. For example, even though the
APLM does not include all of the LISP primitives, if it is easy to write a LISP
interpreter in APL, then the machine should be able to handle them with ease.

Although I believe that the goal of elegance has been satisfied, this is not the
place to make such judgements, nor am I the one to make them. This particular
aspect will have to be decided by less prejudiced readers. A seventh, unstated
goal is that the design should indeed work. It should be clear to the reader who
has reached this point that the basic machine structure proposed is in fact sound
and that an APL machine as described will produce correct answers.
This leaves the question of efficiency to be considered. Because we have not
detailed a complete machine, traditional measures such as encoding efficiencies
or comparisons of cycle times cannot be used. A major emphasis throughout this
work has been to minimize the necessity for temporary storage in expression
evaluation and simultaneously to minimize memory accessing. While these
problems are often of marginal importance in a conventional design, they are quite
significant in an APL machine, since operands are generally arrays. Thus a
temporary store is no longer a single word, but is potentially an array of indefinite
size. Similarly, the conventional problem of saving a single fetch where a quantity
might be in a register, becomes the problem of saving 1000 fetches for an array
operand.

The remainder of this chapter is dedicated to the evaluation of machine
efficiency. We take an analytic approach here, but cannot hope to have a simple
analytic model of the machine per se which would give clean, closed-form
quantitative data about the APLM. Instead, the analysis compares the performance of
the APLM to a fictitious "naive machine," which is simply a straightforward
interpreter of the semantics of APL.

The next section discusses the naive machine (NM) and outlines the assumptions 
upon which the comparisons will be based. In the sequel, we will compare the two 
machines by looking at the number of individual fetches, stores, operations, and 
temporary stores needed to do a particular task. Different tasks will be examined 
with this in mind. At the end of the chapter, these results will be summarized 
together with some conclusions. 

B. The Naive Machine 

Although the APL machine proposed in Chapter IV has never been implemented,
there exist concrete examples of the naive machine. These include APL\7090
(Abrams [1966]), APL\1130 (Berry [1968]), and APL\360 (Falkoff and Iverson
[1968]; Pakin [1968]). The main feature which distinguishes the NM from the
APLM is that the APLM defers many computations while the naive machine
evaluates each subexpression immediately after its operands have been evaluated.
The APLM, by contrast, does some of its evaluations immediately (e.g., scalar
results), defers some indefinitely (by drag-along), and does still others in a
non-direct way (e.g., beating).
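
The difference between immediate and deferred evaluation can be made concrete with a small Python sketch for the vector expression A+B+C. It is only a counting model under stated assumptions: every element read from an array counts as one fetch and every element written counts as one store, and the function names are invented for this example. Neither function is a model of the actual NM or APLM instruction sequences.

```python
def naive_sum3(a, b, c):
    """Evaluate A+B+C operator by operator, materializing B+C first."""
    fetches = stores = 0
    t = [0] * len(b)                 # temporary array for B+C
    for i in range(len(b)):
        t[i] = b[i] + c[i]
        fetches += 2
        stores += 1
    for i in range(len(a)):          # A+t, written back into t
        t[i] = a[i] + t[i]
        fetches += 2
        stores += 1
    return t, fetches, stores


def deferred_sum3(a, b, c):
    """Evaluate A+B+C one result element at a time, with no intermediate array."""
    fetches = stores = 0
    r = [0] * len(a)
    for i in range(len(a)):
        r[i] = a[i] + (b[i] + c[i])  # intermediate sum stays a scalar
        fetches += 3
        stores += 1
    return r, fetches, stores
```

For vectors of length n this model gives 4n fetches and 2n stores for the operator-by-operator evaluation, against 3n fetches and n stores for the element-at-a-time evaluation.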

The following list of assumptions clarifies in more detail the differences 
between the APXiM designed in this work and our "standard" naive machine as 
used in the rest of this chapter. 

1. The naive machine uses the same representation for arrays as does
the APL machine. If the naive machine is APL\360, then this is approximately
true. In fact, APL\360 does not separate DA's from value parts in array
representations. On the other hand, APL\360 represents scalars as rank-0 arrays,
and is thus less efficient in its handling of scalar values. We assume here
that the NM keeps scalar values in a value stack as does the APLM. We have
also (generously) assumed that the NM uses the J-vector representation for
interval vectors. In general, these assumptions cast the naive machine in a
better light than any current implementation of APL.

2. The naive machine generates a result value whenever an operator is
found and its operands are evaluated. (This is exactly the way APL\360 works.)
Further, we assume that the NM will use temporary space allocated to one of
its operands for the result, if possible. For example, if the expression A+B is
to be evaluated, a new temporary space must be found to accommodate the result.
However, if the expression is A+B+C, the subexpression B+C will be evaluated
first, causing the creation of a temporary t which can then be used as the result
destination for the value of A+t.

3. In an assignment to a variable, as in A←expression, the naive machine
performs the assignment simply by storing a pointer to the temporary for the
evaluated expression in the nametable entry for A. Again, this is consistent with
the functioning of APL\360.

4. Each operation in either the NM or the APLM requires a fixed amount
of overhead (e.g., rank checking, domain checking, space allocation, setup,
drag-along, etc.). An analysis of the instructions for both machines shows that
these processes take approximately the same effort in both machines. Since
there is no way to compare this effort with the memory usage measures discussed
here, it will be omitted. For a single statement, this overhead appears as a
linear additive term.

5. Since scalars are kept in the value stack in both machines and since the
VS mechanism is not specified (e.g., it could be a hard-wired stack, or a fast
scratchpad memory, or it could be kept in memory with other array values), all
scalar fetches and stores will be ignored. The effort to evaluate array expressions
always dominates the effort for scalar expressions.

6. There are no distinctions made between data types in the APL machine.
We thus assume that both the APLM and the NM use the same representation for
individual data elements.

7. All scalar operations take the same amount of time to perform. That is,
an add or a multiply will each be counted as a single operation.

8. Finally, it is assumed that both the naive machine and the APL machine 
are implemented in similar technologies so that the cost of memory accesses, 
storage allocations, and operations are the same for both machines. 

C. Analysis of Drag-Along and Beating

To begin the analysis, let us look at a subset of the operations of APL and
derive some analytic results comparing the APLM and the NM. The set to be
considered is

1. Selection operations

2. Monadic and dyadic scalar arithmetic operations

3. Inner products

4. Reductions of the above (this includes outer products)

5. Assignments of the above to unconditioned variables or to variables
conditioned by selection operators.

We consider only those expressions which are array-valued, as scalar expressions
are done similarly in both machines. Each operation requires the machine
evaluating it to do a certain amount of work, summarized in Table 1 below. Tables
2A and 2B summarize the "effort" required to do these manipulations.

In Table 2, some of the entries contain conditional terms or factors. These
account for the different possible initial conditions when a subexpression is
evaluated. Also, notice that in Table 2B, some of the entries contain references to
the functions DOF, DOS, and DOO. These are functions which, given a deferred
expression as argument, return as values the number of fetches, stores, and
operations, respectively, necessary to evaluate the expression. Thus, for the
APL machine, Table 2B does not tell the whole story; we must also take into
account the efforts to evaluate the final deferred expression (by the E-machine).
Hence, it is necessary to give detailed definitions of the DOF, DOS, and DOO
functions.

TABLE 1
Steps in Evaluation of APL Operators

A. Selection Operators

   NAIVE MACHINE
   1. Check rank, domain of operands.
   2. Get space for result DA, value.
   3. Set up DA, M-headers.
   4. Set up copy operation.
   5. Do copy operation.
   6. Adjust VS.

   APL MACHINE
   1. Check rank, domain of operands.
   2. Get space for result DA (if operand is a variable).
   3. Set up DA.
   4. Adjust VS, QS.

B. Monadic Scalar Operators

   NAIVE MACHINE
   1. Get space for result DA, value (only if operand is a variable).
   2. Set up DA, M-headers if space was gotten in step 1.
   3. Do the operation.
   4. Adjust VS.

   APL MACHINE
   1. Defer operation to QS.
   2. Adjust VS, QS.

C. Dyadic Scalar Operators

   NAIVE MACHINE
   1. Check rank, dimensions of operands.
   2. Get space for result DA, value (only if both operands are variables).
   3. Set up DA, M-headers if space was gotten in step 2.
   4. Do the operation.
   5. Adjust VS.

   APL MACHINE
   1. Check rank, dimensions of operands.
   2. If one operand is a scalar, move it to QS.
   3. Defer operation to QS.
   4. Adjust VS, QS.

D. Outer Product

   NAIVE MACHINE
   1. Get space for result DA, value.
   2. Set up DA, M-headers.
   3. Do the operation.
   4. Adjust VS.

   APL MACHINE
   1. If operands are deferred subexpressions, then evaluate them to temp space.
   2. Get space for result DA.
   3. Set up DA.
   4. Defer operation to QS.
   5. Adjust VS, QS.

E. Reduction

   NAIVE MACHINE
   1. Get space for result DA, value.
   2. Set up DA, M-headers.
   3. Do the reduction.
   4. Adjust VS.

   APL MACHINE
   1. Get space for result DA.
   2. If reduction coordinate is other than the last, then do appropriate
      transpose.
   3. Set up DA.
   4. Defer operation to QS.
   5. Adjust VS, QS.

F. Assignment to Simple Variable

   NAIVE MACHINE
   1. If right-hand side is a temp then go to step 6, otherwise do steps
      2 through 7.
   2. Get space for DA, value.
   3. Set up DA, M-headers.
   4. Set up copy operation.
   5. Do copy operation.
   6. Adjust VS.
   7. Adjust Nametable.

   APL MACHINE
   1. If right-hand side is a temp then go to step 6, else proceed.
   2. If the LHS* variable is already defined and is of the correct size
      and does not appear permuted as an operand in the deferred RHS
      then go to step 5.
   3. Get space for DA, value of LHS.
   4. Set up DA and M-headers.
   5. Defer operation in QS.
   6. Adjust VS, QS.
   7. Adjust Nametable.

G. Assignment to a Selected Variable

   NAIVE MACHINE
   1. Check dimensions of LHS, RHS.
   2. Set up copy operation.
   3. Do copy operation.
   4. Adjust VS.

   APL MACHINE
   1. Check dimensions of LHS, RHS.
   2. If RHS contains deferred instances of LHS variable which are permuted
      differently than LHS, then proceed, else go to step 6.
   3. Get space for DA, value of RHS.
   4. Set up DA, M-headers.
   5. Evaluate RHS to this temp.
   6. Defer selected assignment to QS.
   7. Adjust VS, QS.

* LHS and RHS refer to the left-hand side and right-hand side of an assignment
arrow, respectively.






TABLE 2A
Summary of Effort to Evaluate Operators - NAIVE MACHINE

OPERATOR              FETCHES         STORES                   TEMPS                    OPERATIONS

SELECTION             ×/ρR            4+(ρρR)+×/ρR             P1×(4+(ρρR)+×/ρR)        0
(R IS: sel ℰ)

SCALAR MONADIC        ×/ρR            (P1×(4+ρρR))+×/ρR        P1×(4+(ρρR)+×/ρR)        ×/ρR
(R IS: OP ℰ)

SCALAR DYADIC         N1××/ρR         (P2×(4+ρρR))+×/ρR        P2×(4+(ρρR)+×/ρR)        ×/ρR
(R IS: ℰ OP ℱ)

OUTER PRODUCT         (×/ρℰ)+×/ρR     4+(ρρR)+×/ρR             4+(ρρR)+×/ρR             ×/ρR
(R IS: ℰ ∘.OP ℱ)

REDUCTION             ×/ρℰ            4+(ρρR)+×/ρR             4+(ρρR)+×/ρR             ×/ρℰ
(R IS: OP/[K] ℰ)

ASSIGNMENT            P1××/ρℰ         P1×(4+(ρρℰ)+×/ρℰ)        P1×(4+(ρρℰ)+×/ρℰ)        0
A←ℰ

ASSIGNMENT            ×/ρ sel A       ×/ρ sel A                0                        0
(sel A)←ℰ

Notes: P1 ← if ℰ is a variable then 1 else 0
       P2 ← if ℰ and ℱ are both variables then 1 else 0
       N1 ← if ℰ and ℱ are both arrays then 2 else 1



TABLE 2B
Summary of Effort to Evaluate Operators - APL MACHINE

OPERATOR              FETCHES                        STORES                                   TEMPS                    OPERATIONS

SELECTION             0                              N1×(3+ρρR)                               N2×(3+ρρR)               0
(R IS: sel ℰ)

SCALAR MONADIC        0                              0                                        0                        0
(R IS: OP ℰ)

SCALAR DYADIC         0                              0                                        0                        0
(R IS: ℰ OP ℱ)

OUTER PRODUCT         (P1×DOF(ℰ))+(P2×DOF(ℱ))        3+(ρρR)+(P1×DOS(ℰ))+(P2×DOS(ℱ))          3+ρρR                    (P1×DOO(ℰ))+(P2×DOO(ℱ))
(R IS: ℰ ∘.OP ℱ)

REDUCTION             0                              3+(ρρR)+P3×N1×(4+ρρR)                    3+(ρρR)+P3×N1×(3+ρρR)    0
(R IS: OP/[K] ℰ)

ASSIGNMENT            0                              P4×(4+ρρℰ)                               P4×(4+(ρρℰ)+×/ρℰ)        0
A←ℰ

ASSIGNMENT            P5×DOF(ℰ)                      P5×(DOS(ℰ)+4+(ρρℰ)+×/ρℰ)                 P5×(4+(ρρℰ)+×/ρℰ)        P5×DOO(ℰ)
(sel A)←ℰ

NOTES: N1 ← number of array operands in ℰ
       N2 ← number of operands with reference count > 1
       P1 ← if ℰ contains deferred operators then 1 else 0
       P2 ← if ℱ contains deferred operators then 1 else 0
       P3 ← if K≠⌈/ιρρℰ then 1 else 0
       P4 ← if ℰ is a temp or A is defined and of correct size and there
            are no indexing conflicts then 0 else 1
       P5 ← if ℰ must be evaluated first then 1 else 0



For the set of expressions containing only selection operations, scalar
arithmetic operations, outer products, reductions, and assignment, it is relatively
simple to specify the DOF, DOS, and DOO functions. Recall that in the APL
machine, expressions are deferred in QS, which contains an operation code and
an access mask for each entry. Let the function OP(I) be the operation code for
QS[I;] and MASK(I) have as its value the access mask in the AUX field of QS[I;].
Finally, for a given expression in QS, let RR be the dimension of the final result.
For each QS entry whose opcode is IFA, IA, OP, or GOP define the function
D(I) whose value is a dimension vector as follows: if the entry is not within a
reduce segment then D(I) is RR. Otherwise catenate an element with the length
of each reduction coordinate; the innermost reduction corresponds to the last
element of D(I). Thus, D(I) is the vector of limits of the iteration stack which
are active when instruction QS[I;] is executed by the E-machine. The idea here is
that D(I) represents the indexing environment of QS[I;]. If N(I) is the index of the
rightmost 1 in MASK(I) (that is, N(I)←⌈/(MASK(I))/ιρMASK(I)), then the following
algorithm calculates the desired functions:

   RF←RS←RO←0
   I← starting addr of deferred expression in QS
   repeat
   begin
      if OP(I) = IFA then RF ← RF + ×/N(I)↑D(I)
      else if OP(I) = IA then RS ← RS + ×/N(I)↑D(I)
      else if OP(I) ∈ OP, GOP then RO ← RO + ×/N(I)↑D(I)
      I←I+1
      if I > segment ending addr then leave
   end

Then DOF(ℰ)←RF; DOS(ℰ)←RS; DOO(ℰ)←RO.
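
The counting loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `QSEntry` record, its field names, and the sample segment are hypothetical stand-ins for QS rows, and D(I) is supplied directly rather than derived from reduce segments.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Tuple


@dataclass
class QSEntry:
    """Hypothetical stand-in for one row QS[I;] of a deferred expression."""
    op: str          # operation code, e.g. "IFA", "IA", "OP", "GOP"
    mask: List[int]  # access mask from the AUX field (1 = coordinate used)
    d: List[int]     # D(I): iteration-stack limits active for this entry


def rightmost_one(mask: List[int]) -> int:
    """N(I): 1-origin index of the rightmost 1 in the mask (0 if none)."""
    return max((i + 1 for i, bit in enumerate(mask) if bit), default=0)


def deferred_effort(segment: List[QSEntry]) -> Tuple[int, int, int]:
    """Return (DOF, DOS, DOO) for one deferred QS segment."""
    rf = rs = ro = 0
    for entry in segment:
        # x/N(I) take D(I): product of the first N(I) iteration limits
        count = prod(entry.d[:rightmost_one(entry.mask)])
        if entry.op == "IFA":
            rf += count
        elif entry.op == "IA":
            rs += count
        elif entry.op in ("OP", "GOP"):
            ro += count
    return rf, rs, ro


# Assumed example: a 3-by-4 result built from one operand indexed by both
# coordinates and one operand indexed only by the outer coordinate.
segment = [
    QSEntry("IFA", [1, 1], [3, 4]),
    QSEntry("IFA", [1, 0], [3, 4]),
    QSEntry("OP", [1, 1], [3, 4]),
    QSEntry("IA", [1, 1], [3, 4]),
]
print(deferred_effort(segment))
```

The second operand contributes only 3 fetches because its mask stops at the outer coordinate, which is the effect the N(I)↑D(I) term is meant to capture.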

D. Example: A Simple Subclass of Expressions

Since the input to either the naive machine or the APL machine may be any
arbitrary expression, it is difficult to produce a closed-form comparison of the
performance of the two. However, we can look in detail at a simple subset of
expressions and obtain some estimates on how the two machines compare.
Consider the set of expressions of the form A←ℰ, where ℰ is an expression
containing only array-shaped operands combined by scalar arithmetic operators and
selection operators. As an aid to the analysis, construct the tree corresponding
to the expression ℰ, and number all the nodes corresponding to operators. Then,
construct vectors RR, RD, TY, TV, N1 and N2 as follows:

For each node I, representing RESULT←ℰ[I], where ℰ[I] is the subexpression
rooted at node I,

   RD[I] ← ×/ρRESULT (Result Dimension of node I)

   RR[I] ← ρρRESULT (Result Rank of node I)

   TY[I] ← if operator is a select then ¯1 else if monadic then 1 else 2

   TV[I] ← if all sons of node I are variable names then 1 else 0

   N1[I] ← number of leaves in the subtree of node I

   N2[I] ← number of leaves in the subtree of node I accessible through a path
           not including a select operation.

Finally, let R be the number of array operands in ℰ
   M be the number of monadic scalar operators in ℰ (i.e., +/1=TY)
   N be the number of dyadic scalar operators in ℰ (i.e., +/2=TY)
   S be the number of selection operators in ℰ (i.e., +/¯1=TY)
   Z be the number of elements in ℰ (i.e., ×/ρℰ)
   Y be the rank of ℰ (i.e., ρρℰ)
   P be: if APLM must get space for A then 1 else 0.

Note that in a well-formed expression N=R-1.

Then, from Tables 2A and 2B, and the definitions of DOO, DOS, and DOF,
we see that the effort for each machine to evaluate ℰ is as follows:

NAIVE MACHINE

   fetches:     +/RD×|TY
   stores:      (+/RD)++/((¯1=TY)∨TV∧(1≤TY))/(4+RR)
   temps:       +/TV/(4+RR+RD)
   operations:  +/(1≤TY)/RD

APL MACHINE

   fetches:     R×Z
   stores:      Z+(P×(4+Y))++/(¯1=TY)/N1×(3+RR)
   temps:       (P×(4+Y+Z))++/(¯1=TY)/N2×(3+RR)
   operations:  +/(1≤TY)/Z

In general, each formula above is the sum of the relevant entries in Tables 2A
or 2B. As the fetch formulas are obvious, we show the derivation of the store
count for the NM. First, each operator in ℰ calculates a result which must be
stored immediately, which gives the term +/RD. Also, temporary space must be
allocated for selection operations and those cases of scalar operators in which
one of the operands is not itself a temporary. In such a case, another
4+(result-rank) words must be stored. (All but one of these is for the new DA;
the other is for the header word for the value array.) The result ranks of the
operations in ℰ are in the vector RR. Thus, the compression selects those
elements of 4+RR which correspond to the conditions just stated. In particular,
(¯1=TY) is a vector having a one for each selection operator and TV∧(1≤TY) has
a one for each monadic or dyadic scalar operator whose evaluation requires
temporary space to be allocated. The sum of these terms gives the formula
shown; the other formulas are derived similarly.
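
As a concrete check of these formulas, the sketch below evaluates both sets of counts for a small expression tree. The `Node` record and the sample expression A←B+C×D (three 100-element vectors, with P assumed to be 1) are illustrative assumptions and do not come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    rd: int  # RD[I]: number of elements in the node's result
    rr: int  # RR[I]: rank of the node's result
    ty: int  # TY[I]: -1 select, 1 monadic scalar, 2 dyadic scalar
    tv: int  # TV[I]: 1 if all sons are variable names, else 0
    n1: int  # N1[I]: leaves in the subtree of the node
    n2: int  # N2[I]: leaves reachable without passing through a select


def naive_counts(nodes: List[Node]) -> Tuple[int, int, int, int]:
    """(fetches, stores, temps, operations) for the naive machine."""
    fetches = sum(n.rd * abs(n.ty) for n in nodes)
    stores = sum(n.rd for n in nodes) + sum(
        4 + n.rr for n in nodes if n.ty == -1 or (n.tv and n.ty >= 1))
    temps = sum(4 + n.rr + n.rd for n in nodes if n.tv)
    operations = sum(n.rd for n in nodes if n.ty >= 1)
    return fetches, stores, temps, operations


def aplm_counts(nodes: List[Node], r: int, z: int, y: int,
                p: int) -> Tuple[int, int, int, int]:
    """(fetches, stores, temps, operations) for the APL machine."""
    selects = [n for n in nodes if n.ty == -1]
    fetches = r * z
    stores = z + p * (4 + y) + sum(n.n1 * (3 + n.rr) for n in selects)
    temps = p * (4 + y + z) + sum(n.n2 * (3 + n.rr) for n in selects)
    operations = sum(z for n in nodes if n.ty >= 1)
    return fetches, stores, temps, operations


# Assumed example: A <- B + C x D with B, C, D vectors of 100 elements.
tree = [
    Node(rd=100, rr=1, ty=2, tv=1, n1=2, n2=2),  # C x D
    Node(rd=100, rr=1, ty=2, tv=0, n1=3, n2=3),  # B + (temporary)
]
print(naive_counts(tree))
print(aplm_counts(tree, r=3, z=100, y=1, p=1))
```

For this small case the naive machine makes 400 fetches against 300 for the APL machine, and stores roughly twice as many words, in line with the ratios derived next.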

We can form the ratios of the corresponding quantities for each machine and
attempt to get some estimate of their values. RF, the ratio of fetches in the naive
machine to fetches in the APL machine, is given by:

\[
RF = \frac{+/RD \times |TY}{R \times Z}
\;\ge\; \frac{Z \times (M + 2N + S)}{R \times Z}
= \frac{M + S + 2(R - 1)}{R}
\qquad \text{because } N = R - 1
\]

(The inequality holds because each element of RD is at least Z.) Thus,

\[
RF \ge 2 + \frac{M + S - 2}{R}
\]

Hence, for fetches, the APLM does at least twice as well as the NM if there are
at least two monadic or select operators. The worst case is when M or S or N
is 1 and the rest are 0, in which case the ratio is 1. The above also shows that
the ratio increases (without bound) in proportion to the number of monadic and
select operators in the expression ℰ.

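The ratio of stores for the two machines, RS, is:

\[
RS = \frac{(+/RD) + {+/}\big(({}^{-}1 = TY) \vee TV \wedge (1 \le TY)\big)/(4 + RR)}
          {Z + (P \times (4 + Y)) + {+/}({}^{-}1 = TY)/N1 \times (3 + RR)}
\]
\[
\ge \frac{(Z \times \rho RD) + {+/}\big(({}^{-}1 = TY) \vee TV \wedge (1 \le TY)\big)/(4 + RR)}
         {Z + (P \times (4 + Y)) + {+/}({}^{-}1 = TY)/N1 \times (3 + RR)}
\]
\[
= \frac{(M + N + S) + \dfrac{{+/}\big(({}^{-}1 = TY) \vee TV \wedge (1 \le TY)\big)/(4 + RR)}{Z}}
       {1 + \dfrac{(P \times (4 + Y)) + {+/}({}^{-}1 = TY)/N1 \times (3 + RR)}{Z}}
\qquad (\text{since } \rho RD = M + N + S)
\]

But the numerators of the two fractions with denominator Z are bounded,
while Z can increase without bound. Thus for large Z,

\[
RS \approx M + N + S
\]

That is, in expressions in which the size of the operand arrays is large (i.e., at
least as many elements as there are operators) the NM requires more stores
than the APLM, approximately in proportion to the number of operators in the
expression.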

In the case of temporary storage allocated, the ratio, RT, is:

\[
RT = \frac{+/TV/(4 + RR + RD)}
          {(P \times (4 + Y + Z)) + {+/}({}^{-}1 = TY)/N2 \times (3 + RR)}
\;\ge\; \frac{+/TV/(4 + RR + RD)}
             {(4 + Y + Z) + {+/}({}^{-}1 = TY)/N2 \times (3 + RR)}
\]

Again, the lower bound is greater than 1, since (+/TV)≥1. In this case, the
ratio is of the order of +/TV for large Z, which is a function of the tree structure
of ℰ rather than an explicit function of its operator count. Note that in the case
where ℰ contains no select operations and P is 0, the ratio is infinite, since the
APLM requires no temporary storage.

For the case of operations the ratio, RO, is:

\[
RO = \frac{+/(1 \le TY)/RD}{+/(1 \le TY)/Z}
\]

But Z≤RD and the compressions in both numerator and denominator select the
same terms. Thus, RO≥1.

E. Example: An APL One-Liner

APL makes it easy to produce simple one-line programs to do
some interesting task. One such is the program (expression) for finding
all the prime numbers less than or equal to N, as shown below.
(Index origin is 1.)

   PRIMES←(2=+/[1]0=(ιN)∘.|ιN)/ιN
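
For readers less familiar with APL, the same computation can be sketched step by step in Python. The function name `primes_one_liner` is an assumption made for illustration; the intermediate lists mirror the subexpressions of the one-liner and, like the naive machine, materialize the full N-by-N table of remainders.

```python
def primes_one_liner(n):
    """Mirror PRIMES <- (2 = +/[1] 0 = (iota N) o.| iota N) / iota N."""
    idx = range(1, n + 1)                       # iota N, index origin 1
    # Outer product with residue: entry [i][j] is j mod i (APL's i|j).
    remainders = [[j % i for j in idx] for i in idx]
    # 0 = ... : 1 wherever the division is exact.
    exact = [[int(r == 0) for r in row] for row in remainders]
    # +/[1] : sum along the first coordinate, one total per column.
    divisor_counts = [sum(col) for col in zip(*exact)]
    # 2 = ... : a number is prime when it has exactly two divisors.
    mask = [count == 2 for count in divisor_counts]
    # Compression: keep the elements of iota N selected by the mask.
    return [j for j, keep in zip(idx, mask) if keep]


print(primes_one_liner(20))
```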

Although the algorithm used is clearly inefficient, such expressions are not
uncommon. Since the APLM purports to be an efficient evaluator of expressions,
it is worthwhile to look at this example in more detail. The machine code for
this expression is:

   OP      OPERAND   COMMENTS

   LDNF    N         This gives the compressee, ιN
   IOTA
   LDNF    N         These are the ιN operands of outer product
   IOTA
   LDNF    N
   IOTA
   GDF     MOD       (ιN)∘.|ιN: matrix of remainders of all possible divisions
   LDS     0         0=(ιN)∘.|ιN: has 1 for each 0 remainder, else 0
   EQ
   LDS     1         +/[1]0=(ιN)∘.|ιN: add rows of this matrix
   RED     ADD
   LDS     2         2=+/[1]0=(ιN)∘.|ιN: find which columns have two 1 entries
   EQ
   LDS     1         Do compression. These are the primes
   CMPRS
   LDN     PRIMES    Assign result to PRIMES
   ASGN



Since the number of scalar operations performed is the same for both
machines, this will not be measured. At the point before executing the LDS 1
instruction which precedes the CMPRS, the state of the APL machine is as
shown in Fig. 1.

FIGURE 1: State of the registers before compress operator. (Diagram of the VS
and QS registers; the QS fields shown are OP, VALUE, LINK, and AUX.)



Up to this point, the NM used memory as follows:

   Instruction   Fetches     Stores        Temps

   GDF           N²+N        N²+2N+16      N²+2N+16    (N+5 stores and temps
                                                       necessary to evaluate
                                                       each ιN before GDF +
                                                       the space for result)
   EQ            N²          N²            0
   RED           N²          N+5           N+5
   EQ            N           N             0

   TOTAL         3N²+2N      2N²+4N+21     N²+3N+21

The count for the APLM at this point is 0 fetches, 9 stores, and 9 temps for the
descriptors T1 and T2. However, when the CMPRS operator is found, the left
operand must be evaluated as explained in Chapter IV. Thus, the long QS segment
must be handed over to the E-machine. This requires N²+N fetches, N+5 stores,
and N+5 temps. In order to do the CMPRS in the NM, the right operand (ιN)
must be evaluated, requiring N+5 each of stores and temps. The CMPRS itself
takes another N+P fetches, P+5 stores, P+5 temps in the NM, where P is the
length of the result. In the APLM, the CMPRS is expanded and deferred, as is
the ASGN which follows. The NM requires no work to do the ASGN. The APLM,
after this instruction, has its QS full of deferred code for the CMPRS and ASGN.
It had to allocate P+5 temps for the result of ASGN (assuming PRIMES was not
the correct size already). Passing the QS to the EM requires another N+P fetches
and P stores for the APLM. Thus the grand totals are:

                    FETCHES      STORES          TEMPS

   NAIVE MACHINE    3N²+3N+P     2N²+5N+P+31     N²+4N+P+31
   APL MACHINE      N²+2N+P      N+P+23          N+P+23

Recall that P is really a function of N, the number of primes not exceeding N,
which is asymptotic to N/log N. Thus, we can evaluate the performance ratios
between the two machines in some specific cases. These ratios are RF, RS,
and RT, the ratios of NM fetches, stores, and temporaries to the corresponding
APLM quantities, respectively. Also of interest is RM, which counts all memory
accesses (fetches + stores), and is the ratio of these two quantities. Table 3
below tabulates these quantities for a few values of N.

TABLE 3
Performance Ratios for Primes Problem as a Function of N

       N         P        RF         RS        RM         RT

      10         4      2.69        7.7      3.84        4.7
     100        25      2.97      138.9      4.91       70.6
     500        95      2.99      813.3      4.98      408.0
    1000       168      2.997    1683.6      4.99      843.2
    5000       669      2.999    8788.8      4.998    4395.8
   10000      1229      2.9997  17779.2      4.9992   8891.0
   50000      5133      2.99994 90656.6      4.9998  45329.7

   limit  N/log N       3         2N         5          N

TABLE 4
Operation Count for One Pass Through Main Loop, Program REC

                      NAIVE MACHINE                            APL MACHINE
STATEMENT  FETCHES      STORES        TEMPS         FETCHES      STORES        TEMPS

6          S            2S+5          S+5           0            S+4           4
7          2K           2K+5          K+5           K            K+9           K+9
8          1.5K         0             0             1.5K         0             0
9          8            23            21            8            31            29
10         4S+4         4S+20         2S+20         4S+4         4S+38         2S+38
11         3S²+3S       2S²+2S+5      S²+S+5        S²+S         4             4
12         3S+3         3S+8          S+6           S+1          S+9           8
13         3S²+9S+1     2S²+6S+22     S²+4S+22      2S²+4S       S²+2S+24      S²+2S+24
14         2S²+2S       2S²+2S+12     2S²+2S+12     S²+S         S²+S+16       S²+S+16
15         S            S+5           S+5           S            S+9           S+9

TOTAL:     8S²+23S+16   6S²+20S+105   4S²+12S+101   4S²+12S+13   2S²+10S+144   2S²+6S+141
           +3.5K        +2K           +K            +2.5K        +K            +K

Table 3 indicates that the APLM does significantly better than the
NM on this program. The RS figures may be deceptive since in terms of total
memory accesses the ratio approaches a limit of 5. This is still significant, as
is the RT ratio, which increases linearly with N (for large N).
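
The entries of Table 3 follow directly from the grand totals given above. A minimal Python sketch of that arithmetic is shown here; the function names are illustrative assumptions, and P must be supplied by the caller as the number of primes not exceeding N.

```python
def primes_counts(n, p):
    """Grand totals (fetches, stores, temps) for each machine."""
    naive = (3 * n * n + 3 * n + p,
             2 * n * n + 5 * n + p + 31,
             n * n + 4 * n + p + 31)
    aplm = (n * n + 2 * n + p,
            n + p + 23,
            n + p + 23)
    return naive, aplm


def primes_ratios(n, p):
    """Return (RF, RS, RM, RT), each a naive-to-APLM ratio."""
    (nf, ns, nt), (af, as_, at) = primes_counts(n, p)
    rf = nf / af
    rs = ns / as_
    rm = (nf + ns) / (af + as_)   # all memory accesses: fetches + stores
    rt = nt / at
    return rf, rs, rm, rt


for n, p in [(10, 4), (100, 25), (1000, 168)]:
    print(n, p, [round(x, 3) for x in primes_ratios(n, p)])
```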

F. Example: Matrix Inversion Programs

As a final example, we analyze the performance of both machines on a
standard example, a program which does matrix inversion by elimination with
pivoting. To avoid charges of bias, the particular program used was taken from
the literature rather than written by the author (Falkoff and Iverson [1968a], p. 19).
The program REC is shown in Fig. 2 and has been changed only by altering the
syntax of the conditional branch statements. This does not affect the
measurements made here and is done purely for esthetic reasons.

Table 4 counts the memory accesses and temporary stores statement by
statement for one pass through the main loop in program REC. This loop is
executed S times. All but the terms involving the variable K are independent of
the iteration count. K varies from S to 1 from the first pass to the last. Thus,
we can obtain the totals for all passes through the loop by multiplying non-K terms
by S and by summing the K terms. This gives the counts in Table 5 below:

TABLE 5
Total Operation Count For Main Loop, Program REC

                 FETCHES                  STORES                  TEMPS

Naive Machine    8S³+24.75S²+17.75S       6S³+21S²+106S           4S³+12.5S²+101.5S

APL Machine      4S³+13.25S²+14.25S       2S³+10.5S²+144.5S       2S³+6.5S²+141.5S
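
The step from the per-pass totals of Table 4 to the closed forms of Table 5 can be checked numerically. The sketch below is only an illustration: the dictionary layout and function names are assumptions, while the coefficients are copied from the TOTAL row of Table 4 and from Table 5.

```python
# Per-pass totals from Table 4 as coefficients of (S^2, S, constant, K).
PER_PASS = {
    ("NM", "fetches"): (8, 23, 16, 3.5),
    ("NM", "stores"): (6, 20, 105, 2),
    ("NM", "temps"): (4, 12, 101, 1),
    ("APLM", "fetches"): (4, 12, 13, 2.5),
    ("APLM", "stores"): (2, 10, 144, 1),
    ("APLM", "temps"): (2, 6, 141, 1),
}

# Closed forms from Table 5 as coefficients of (S^3, S^2, S).
TOTALS = {
    ("NM", "fetches"): (8, 24.75, 17.75),
    ("NM", "stores"): (6, 21, 106),
    ("NM", "temps"): (4, 12.5, 101.5),
    ("APLM", "fetches"): (4, 13.25, 14.25),
    ("APLM", "stores"): (2, 10.5, 144.5),
    ("APLM", "temps"): (2, 6.5, 141.5),
}


def loop_total(key, s):
    """Sum the per-pass count over the S passes, with K running from S to 1."""
    a, b, c, k_coeff = PER_PASS[key]
    return sum(a * s * s + b * s + c + k_coeff * k for k in range(1, s + 1))


def closed_form(key, s):
    """Evaluate the corresponding Table 5 polynomial."""
    a, b, c = TOTALS[key]
    return a * s ** 3 + b * s ** 2 + c * s


for key in PER_PASS:
    print(key, loop_total(key, 10), closed_form(key, 10))
```

The two columns printed agree because the K terms sum to S(S+1)/2 times their coefficient.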

     ∇ B ← REC A ; P ; I ; J ; K ; S
       ⍝ MATRIX INVERSION BY ELIMINATION WITH PIVOTING
1      IF (2=ρρA)∧=/ρA THEN →L1
       ⍝ ERROR EXIT
2  L2: ⎕ ← 'NO INVERSE FOUND'
3      RETURN
       ⍝ S IS DIMENSION OF A
       ⍝ P RECORDS PERMUTATIONS OF ROWS OF A
       ⍝ K SELECTS SUBARRAY OF A FOR ELIMINATION
4  L1: P ← ιK ← S ← 1↑ρA
       ⍝ ADJOIN NEW COL TO A FOR RESULTS
5      A ← ((Sρ1),0)\A
       ⍝ ***MAIN LOOP*** (REPEATED S TIMES)
       ⍝ INITIALIZE LAST COLUMN
6  L3: A[;S+1] ← 1=ιS
       ⍝ FIND PIVOT ELEMENT, WITH ROW INDEX I
7      J ← |A[ιK;1]
8      I ← J ι ⌈/J
       ⍝ INTERCHANGE ROWS 1 AND I
       ⍝ RECORD THE INTERCHANGE IN P
9      P[1,I] ← P[I,1]
10     A[1,I;ιS] ← A[I,1;ιS]
       ⍝ CHECK FOR SINGULARITY
11     IF 1E¯30>|A[1;1]÷⌈/|,A THEN →L2
       ⍝ NORMALIZE PIVOT ROW
12     A[1;] ← A[1;] ÷ A[1;1]
       ⍝ ELIMINATION STEP
13     A ← A-((1≠ιS) × A[;1]) ∘.× A[1;]
       ⍝ ROTATE A TO PREPARE FOR NEXT STEP
       ⍝ THIS BRINGS 'ACTIVE' SUBARRAY TO UPPER LEFT
14     A ← 1⌽[1]1⌽A
15     P ← 1⌽P
       ⍝ ITERATE ON K
16     IF 0<K←K-1 THEN →L3
       ⍝ DO COLUMN PERMUTATIONS TO PRODUCE RESULT
17     B ← A[;PιιS]
     ∇

FIGURE 2: EXAMPLE PROGRAM: REC

In order to compare the performance of the APL machine to the naive machine,
let us form the ratios of the corresponding counts and see how they behave for
different values of S. (Recall that S is the dimension of the matrix being inverted
by the program under consideration.) The first derivatives of all the ratios are
positive for S>0, so that all ratios are increasing as S increases. Table 6
summarizes the properties of the ratios as a function of S.

Let RF(S) be the ratio of fetches in the NM to those in the APLM, RS(S) be
the ratio of stores, RT(S) be the ratio of temporary storage allocated, and RM(S)
the ratio of all memory accesses (fetches + stores). Then,

\[
RF(S) = \frac{8S^2 + 24.75S + 17.75}{4S^2 + 13.25S + 14.25}
\]
\[
RS(S) = \frac{6S^2 + 21S + 106}{2S^2 + 10.5S + 144.5}
\]
\[
RM(S) = \frac{14S^2 + 45.75S + 123.75}{6S^2 + 23.75S + 158.75}
\]
\[
RT(S) = \frac{4S^2 + 12.5S + 101.5}{2S^2 + 6.5S + 141.5}
\]

TABLE 6
Machine Comparison Ratios For Main Loop of REC

       S     RF(S)    RS(S)    RM(S)    RT(S)

       1     1.6      0.847    0.97     0.787
       2     1.75     0.99     1.18     0.878
       3     1.82     1.15     1.36     0.978
       5     1.89     1.46     1.64     1.18
      10     1.95     2.04     1.99     1.54
     100     1.996    2.94     2.31     1.99
    1000     1.9996   2.995    2.332    1.9997

   limit     2        3        2 1/3    2
   S→∞
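
The ratios in Table 6 can be recomputed from the Table 5 polynomials after cancelling the common factor S. The sketch below is illustrative only; the names `RATIOS`, `ratio`, and `first_s_above_one` are assumptions introduced here.

```python
# Numerator and denominator coefficients (S^2, S, constant) of each ratio,
# taken from Table 5 after dividing every total by S.
RATIOS = {
    "RF": ((8, 24.75, 17.75), (4, 13.25, 14.25)),
    "RS": ((6, 21, 106), (2, 10.5, 144.5)),
    "RM": ((14, 45.75, 123.75), (6, 23.75, 158.75)),
    "RT": ((4, 12.5, 101.5), (2, 6.5, 141.5)),
}


def quadratic(coeffs, s):
    a, b, c = coeffs
    return a * s * s + b * s + c


def ratio(name, s):
    """Naive-machine count divided by APL-machine count for dimension S."""
    num, den = RATIOS[name]
    return quadratic(num, s) / quadratic(den, s)


def first_s_above_one(name, limit=1000):
    """Smallest matrix dimension S at which the APL machine comes out ahead."""
    for s in range(1, limit + 1):
        if ratio(name, s) > 1:
            return s
    return None


for s in (1, 2, 3, 5, 10, 100, 1000):
    print(s, [round(ratio(name, s), 3) for name in ("RF", "RS", "RM", "RT")])
print({name: first_s_above_one(name) for name in RATIOS})
```

The break-even dimensions found this way match the discussion that follows: stores favor the APL machine from S=3 and temporaries from S=4.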

An examination of Table 6 shows that for input arrays A of dimension greater
than or equal to 3,3 the APL machine does better than the naive machine by using
fewer fetches and stores. If ρA is 4,4 or more, fewer temporaries are allocated
by the APLM. Finally, the entries for S=10 and S=100 show that these
improvements rapidly reach the theoretical limits. In the region S≤4 the size of
descriptor arrays is approximately the same as the size of the value part of
vectors of length S and not much less than the size of arrays of dimension S,S.
Thus for small S, the extra overhead in the APLM for creating descriptor arrays in
drag-along predominates. However, as S increases, the APL machine improves
significantly compared to the naive machine in its economy of memory usage and
access.

The program REC used in the previous discussion was taken straight from
the literature and was changed only by altering the branch commands and by
replacing the operator α by an equivalent construction (because α is no longer a
defined operator in APL). Primarily, it is important to emphasize that this is
not a specially prepared example designed to tout the virtues of the APL machine.
In some sense, this is a "typical" program. By looking more closely at Table 4
we can get a clearer idea of where the APLM does better than the NM and where
it lags behind.

The APL machine does better (that is, uses fewer fetches, stores, and/or
temporaries) than the naive machine on statements 6, 7, 11, 12, 13, and 14, does
the same as the NM on statement 8, and does worse on statements 9, 10, and 15.
The places where the NM does better than the APLM are precisely those statements
or expressions in which the more successful strategy is to do an immediate
evaluation rather than defer the operation. All three are, in this example,
statements of the form variable ← T variable, where T is an arbitrary permutation
of the subscripts of variable. In all three of these cases, the APLM does worse
only by an additive constant, which is the space (and stores) required for a DA
to describe the deferred right-hand side of the expression. The NM avoids this
by evaluating directly. The same number of fetches are done by both machines
for these statements. Of more interest are the cases where the APLM improves
on the NM. In all situations these are statements involving more than one operation
on the right-hand side of the assignment arrow. By using drag-along and beating,
the APLM requires fewer temporaries for intermediate results, which in turn
requires fewer stores and consequently fewer fetches when the intermediate results
are used later in the expression. The most dramatic demonstration of the efficacy
of drag-along is shown in the use of temps in statements 6, 11, and 12 and the
stores in statement 11. In all these cases the APL machine uses storage in
proportion to the number of array operands while the naive machine requires
storage proportional to the size of the array operands. Also, with the exception
of statement 10, the number of stores for each statement is proportional to the
size of the result for the APLM while in the NM it is generally proportional to
both the size of the result and the number of array operations.

As an interesting experiment to see how much these measures of the machine's
operation are a function of the actual machine design and how much they depend
on the sample program, the author rewrote the function REC in the form shown
in Fig. 3, where it is renamed REC1. REC1 is the same algorithm used in REC
except that the actual permutations of array A in lines 10 and 14 of REC have been
eliminated by using appropriate indexing instead. Also, statement 13 in REC
(which corresponds to statement 14 in REC1) is recast to eliminate unnecessary
operations and to minimize temporaries in both machines. An analysis of the
main loop similar to that for program REC is summarized in Table 7.

     ∇ B ← REC1 A ; I ; J ; N ; R ; S ; T ; W
       ⍝ MATRIX INVERSION BY ELIMINATION WITH PIVOTING
       ⍝ 'OPTIMIZED' VERSION
       ⍝ THIS PROGRAM DIFFERS FROM REC IN THAT ARRAY
       ⍝ PERMUTATIONS ARE DONE BY CHANGING THE
       ⍝ PERMUTATION VECTOR, R, RATHER THAN ACTUALLY
       ⍝ PERMUTING THE MAIN ARRAY. A IS THEN ACCESSED
       ⍝ BY INDEXING WITH R.
1      IF (2=ρρA)∧=/ρA THEN →L1
2  L2: ⎕ ← 'NO INVERSE FOUND'
3      RETURN
4  L1: R ← ιS ← (ρA)[1]
       ⍝ S IS DIMENSION OF A
       ⍝ R RECORDS PERMUTATIONS AND IS USED TO ACCESS A
       ⍝ N COUNTS ITERATIONS
5      N ← 0
       ⍝ ADD NEW COL TO A; BUILD RESULT IN LEFT COL
6      A ← (0,Sρ1)\A
       ⍝ ***MAIN LOOP*** (REPEATED S TIMES)
       ⍝ FIND PIVOT ELEMENT
7  L3: J ← |A[(-N)↓R;N+2]
8      I ← J ι ⌈/J
       ⍝ INTERCHANGE BY ALTERING PERMUTATION VECTOR
9      R[1,I] ← R[I,1]
       ⍝ INITIALIZE RESULT COLUMN
10     A[;N+1] ← R[1] = ιS
11     IF 1E¯30 > |A[R[1];1] ÷ ⌈/|,A THEN →L2
       ⍝ NORMALIZE PIVOT ROW, AND SAVE IN W
12     W ← A[R[1];] ← A[R[1];] ÷ A[R[1];N+2]
       ⍝ T IS ACTIVE COLUMN
13     T ← A[;N+2]
       ⍝ ELIMINATION STEP
14     A[1↓R;] ← A[1↓R;] - T[1↓R] ∘.× W
       ⍝ 'ROTATE' A BY ROTATING R
15     R ← 1⌽R
       ⍝ ITERATE ON N
16     IF S > N←N+1 THEN →L3
17     B ← A[;RιιS]
     ∇

FIGURE 3: 'OPTIMIZED' EXAMPLE PROGRAM: REC1

TABLE 7
Operation Count for One Pass Through Main Loop, Program REC1

                      NAIVE MACHINE                            APL MACHINE
STATEMENT  FETCHES      STORES        TEMPS         FETCHES      STORES        TEMPS

7          4S-4N        3S-3N+10      2S-2N+10      2S-2N        S-N+17        S-N+17
8          1.5S-1.5N    0             0             1.5S-1.5N    0             0
9          8            23            21            8            31            29
10         S            2S+5          S+5           0            S+4           4
11         3S²+3S       2S²+2S+5      S²+S+5        S²+S         4             4
12         3S+3         3S+8          S+6           S+1          2S+10 *       8 **
13         S            S+5           S+5           S            S+4 *         4 ***
14         5S²+5S-10    4S²+4S+19     2S²+4S+26     2S²+4S-6     S²+30         31
15         S            S+5           S+5           S            S+9           S+9

TOTAL:     8S²+19.5S+1  6S²+16S+80    3S²+11S+83    3S²+11.5S+3  S²+6S+109     2S+106
           -5.5N        -3N           -2N           -3.5N        -N            -N
                                                                 (+10 once)    (+2S+11 once)

*   +5 once for entire loop
**  +S+6 once for entire loop
*** +S+5 once for entire loop


In this algorithm, as in REC, the inner loop is performed S times. The
counts shown in Table 7 are independent of the iteration number except for terms
involving the variable N. Examination of the program shows that N goes from 0 to
S-1, increasing by 1 with each pass through the loop. Thus, as in the case of
REC, we can obtain total counts for the main loop by summing the N terms and
multiplying the others by S. The results are summarized in Table 8.

TABLE 8
Total Operation Counts For Main Loop, Program REC1

                  FETCHES                STORES                  TEMPS

Naive Machine     8S³+16.75S²+3.75S      6S³+14.5S²+81.5S        3S³+10S²+84S

APL Machine       3S³+9.75S²+4.75S       S³+5.5S²+109.5S+10      1.5S²+108.5S+11
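
The step from the per-pass totals to the loop totals is mechanical: the terms in N sum to their coefficient times S(S-1)/2, the remaining terms are multiplied by S, and the "once" terms are added a single time. As a hedged check, the short Python sketch below redoes this summation numerically. The function names and dictionary keys are assumptions introduced here; the coefficients are the TOTAL row of Table 7 and the entries of Table 8.

```python
from fractions import Fraction


def loop_total(s, s2, s1, c, n_coeff, once=0):
    """Sum (s2*S^2 + s1*S + c + n_coeff*N) over N = 0..S-1, then add `once`."""
    return sum(s2 * s * s + s1 * s + c + n_coeff * n for n in range(s)) + once


def table8_from_table7(s):
    """Loop totals obtained by summing the Table 7 TOTAL row over all passes."""
    f = Fraction
    return {
        "naive_fetches": loop_total(s, 8, f("19.5"), 1, f("-5.5")),
        "naive_stores": loop_total(s, 6, 16, 80, -3),
        "naive_temps": loop_total(s, 3, 11, 83, -2),
        "apl_fetches": loop_total(s, 3, f("11.5"), 3, f("-3.5")),
        "apl_stores": loop_total(s, 1, 6, 109, -1, once=10),
        "apl_temps": loop_total(s, 0, 2, 106, -1, once=2 * s + 11),
    }


def table8_closed_form(s):
    """The polynomials listed in Table 8."""
    f = Fraction
    return {
        "naive_fetches": 8 * s**3 + f("16.75") * s**2 + f("3.75") * s,
        "naive_stores": 6 * s**3 + f("14.5") * s**2 + f("81.5") * s,
        "naive_temps": 3 * s**3 + 10 * s**2 + 84 * s,
        "apl_fetches": 3 * s**3 + f("9.75") * s**2 + f("4.75") * s,
        "apl_stores": s**3 + f("5.5") * s**2 + f("109.5") * s + 10,
        "apl_temps": f("1.5") * s**2 + f("108.5") * s + 11,
    }
```

Comparing `table8_from_table7(s)` with `table8_closed_form(s)` for a range of S is a quick way to confirm that the two tables agree.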

An immediate, rather startling observation from this table is that all of its
entries are strictly less than the corresponding entries in Table 5, which summarizes
the operations of REC. This is somewhat surprising because, although the program
was rewritten in order to optimize it for the APL machine, the rewriting unexpectedly
improved the performance of the naive machine as well. In any case, this simply
lends more weight to the data summarized in Table 9, where the performance
ratios are computed for the two machines operating on this program.

For program REC1, based on the data in Table 8, the ratios are:

     RF(S) = (8S² + 16.75S + 3.75) / (3S² + 9.75S + 4.75)

     RS(S) = (6S³ + 14.5S² + 81.5S) / (S³ + 5.5S² + 109.5S + 10)

     RM(S) = (14S³ + 31.25S² + 85.25S) / (4S³ + 15.25S² + 114.25S + 10)

     RT(S) = (3S³ + 10S² + 84S) / (1.5S² + 108.5S + 11)
TABLE 9
Machine Comparison Ratios For Main Loop of REC1

    S      RF(S)     RS(S)     RM(S)     RT(S)

    1      1.63      0.81      0.91      0.8
    2      1.91      1.04      1.23      0.99
    3      2.07      1.29      1.53      1.21
    5      2.24      1.85      2.02      1.77
   10      2.41      3.11      2.69      3.88
  100      2.64      5.77      3.44      120.2
 1000      2.66      5.98      3.49      1871.3

limit      2 2/3     6         3.5       2S
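
The entries of Table 9 follow directly from the Table 8 polynomials. The Python sketch below, offered only as an illustration, evaluates the four ratios so that individual entries can be recomputed. The function names `rf`, `rs`, `rm`, and `rt` simply mirror the table headings and are not part of the machine design; RM is taken to be the ratio of fetches plus stores.

```python
def rf(s):
    """Fetch ratio, naive machine over APL machine (common factor S cancelled)."""
    return (8 * s**2 + 16.75 * s + 3.75) / (3 * s**2 + 9.75 * s + 4.75)


def rs(s):
    """Store ratio."""
    return (6 * s**3 + 14.5 * s**2 + 81.5 * s) / (s**3 + 5.5 * s**2 + 109.5 * s + 10)


def rm(s):
    """Memory-access ratio: fetches plus stores on each machine."""
    return (14 * s**3 + 31.25 * s**2 + 85.25 * s) / (
        4 * s**3 + 15.25 * s**2 + 114.25 * s + 10)


def rt(s):
    """Temporary-storage ratio; grows like 2S for large S."""
    return (3 * s**3 + 10 * s**2 + 84 * s) / (1.5 * s**2 + 108.5 * s + 11)


if __name__ == "__main__":
    for s in (1, 2, 3, 5, 10, 100, 1000):
        print(s, round(rf(s), 2), round(rs(s), 2), round(rm(s), 2), round(rt(s), 2))
```

For large S the first three ratios approach 8/3, 6, and 3.5, while `rt(s) / s` approaches 2, in agreement with the limit row of the table.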



G. Discussion 



In the preceding sections we look at a number of typical inputs to the APL
machine and find that in all but a few singular cases, it evaluates them more
efficiently than a corresponding naive machine. This is a fair kind of comparison
because although the naive machine mentioned here is hypothetical, it is based
on the design of existing APL implementations, at least one of which is commercially
available. The important question, of course, is what kinds of conclusions we may
draw from these particular cases. I offer the following:

1. Section D derives lower bounds, all greater than 1, for the ratio between 
memory accesses and temporary use on the two machines on a simple class of 
expressions. From this and the previous section it appears that the APLM 
evaluates expressions of the type analyzed in Chapter II more efficiently than 
the NM. 

2. Operations involving scalar operands are done equally well on both machines. 
3. Sections E and F contain more realistic program examples which were
analyzed in detail. In both cases, the APLM improves significantly on the NM
in its use of memory.

4. The only cases where the APLM does worse are those expressions
containing a single operator which does not fit into the beating scheme, and for
which the best evaluation strategy is to evaluate immediately rather than to
defer. In these cases, the NM does slightly better than the APLM, but only by
a small additive constant (the space and stores needed for the APLM to
construct a deferred descriptor).

In view of the above, it is clear that in most cases the APL machine design
proposed here is more efficient than a naive machine, in the sense that for a
given program the APLM uses fewer fetches and stores and allocates fewer
temporaries than the naive machine. *



* A corollary worth noting is that there exist inputs (i.e., programs) for which
the APLM always performs worse than the NM according to the measures derived
here. However, this should be neither startling nor alarming and does not detract
from the general conclusion above.
CHAPTER VI
CONCLUSIONS

In this chapter, we will summarize all that has gone before and indicate some 
directions for future research on this subject. 
A. Summary 

Although the original goal of this investigation was to produce a machine
architecture appropriate to the language APL, some of the work done in pursuit
of this goal is interesting in itself. In particular, we call attention
to the mathematical analysis discussed in Chapter II. There we find that
there is a subset of APL operators (the selection operators) whose compositions
are also selection operators. Further, compositions of these operators can be
represented compactly in a standard form. Moreover, there is a set of
transformations sufficient to transform any expression consisting solely of selection
operators acting on a single array into an equivalent expression in standard form.
By extension, similar results are described that apply to select expressions which
include scalar arithmetic operators, reductions, and inner and outer products.

One result, of at least theoretical interest, is that all inner products can be
represented as a reduction of a transpose of an outer product (Theorem Tb).
The general dyadic form is introduced in Chapter II as a vehicle for extending
the results about selection operators on single arrays or scalar products to
analogous results on inner and outer products.
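
The relationship between inner and outer products can be illustrated for the ordinary matrix product. The Python sketch below is a small illustration under stated assumptions, not a restatement of the theorem: it forms the full outer product, selects the diagonal on which the two middle subscripts coincide (the effect of a transpose with a repeated axis, along the lines of +/1 3 3 2⍉A∘.×B in APL), and reduces along it. The function names are introduced here for illustration only.

```python
def outer_product(a, b):
    """Outer product of two matrices: o[i][k][l][j] = a[i][k] * b[l][j]."""
    return [[[[x * y for y in b_row] for b_row in b] for x in a_row] for a_row in a]


def inner_via_outer(a, b):
    """Plus-times inner product computed as a reduction over the diagonal
    k == l of the outer product."""
    o = outer_product(a, b)
    shared = len(b)  # length of the axis common to a's columns and b's rows
    return [[sum(o[i][k][k][j] for k in range(shared))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def inner_direct(a, b):
    """Conventional matrix product, used only for comparison."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]
```

The outer product is of course far larger than the result, which is why the identity is of theoretical rather than practical interest when evaluated literally.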

In Chapter III, we show that if arrays are represented in row-major order
and if the representation of the storage access function for an array is kept separate
from the array value, then the result of applying a selection operator to an array
can be obtained simply by transforming the mapping function. This approach is
the basis for beating, one of the novel features of the APL machine. In mathematical
terms, beating is equivalent to the following: if an array is construed as a function
(the storage access function S) applied to an ordered set of values A, and if F1,
F2, ..., FN are selection operators, then the sequence

     F1(F2(...(FN(S(A)))))

is equivalent to some new function T(A), where T is a functional composition with ∘:

     T = (F1 ∘ (F2 ∘ (... ∘ (FN ∘ S)))).
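
Beating can be pictured with a small descriptor that holds an offset and one stride per axis, kept apart from the stored values. The Python sketch below is an illustration under assumptions and not the machine's actual descriptor format: the class `ArrayView` and its method names are hypothetical, indices are zero-origin, and only three selection operators (transpose, drop, and reversal) are shown.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ArrayView:
    """A storage access function: shape, strides, and offset into shared data."""
    data: list
    shape: tuple
    strides: tuple
    offset: int = 0

    @staticmethod
    def row_major(data, shape):
        """Build the access function for values stored in row-major order."""
        strides, step = [], 1
        for extent in reversed(shape):
            strides.append(step)
            step *= extent
        return ArrayView(data, tuple(shape), tuple(reversed(strides)))

    def transpose(self):
        """Reverse the order of the axes; no values are moved."""
        return ArrayView(self.data, self.shape[::-1], self.strides[::-1], self.offset)

    def drop(self, axis, count):
        """Drop the first `count` positions along `axis` (assumes count <= extent)."""
        shape = list(self.shape)
        shape[axis] -= count
        return ArrayView(self.data, tuple(shape), self.strides,
                         self.offset + count * self.strides[axis])

    def reverse(self, axis):
        """Reverse along `axis` by negating its stride and moving the offset."""
        strides = list(self.strides)
        offset = self.offset + (self.shape[axis] - 1) * strides[axis]
        strides[axis] = -strides[axis]
        return ArrayView(self.data, self.shape, tuple(strides), offset)

    def at(self, *index):
        """Fetch one element through the access function."""
        return self.data[self.offset + sum(i * s for i, s in zip(index, self.strides))]

    def materialize(self):
        """Produce the selected values in row-major order (done only on demand)."""
        return [self.at(*idx) for idx in product(*(range(n) for n in self.shape))]
```

Composing `transpose`, `drop`, and `reverse` yields a new descriptor over the same `data` list, which corresponds to replacing the sequence of selection operators by the single function T.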
Chapter IV describes a machine based on the beating process and the
drag-along principle. The latter says that all array calculations should be deferred as
long as possible in order to gain a wider context of information about the expression
being calculated. This is done because the extra information
might allow the expression to be simplified before it is evaluated. This is particularly
important when, as in APL, operands are array-shaped. In effect, a language
like APL, which allows sophisticated operations on structured data to be encoded
very compactly, makes it possible to write expressions which, though
innocent-looking, require much calculation. One major goal of the machine design
is to minimize unnecessary calculation in evaluating APL programs, and
drag-along is an important way of doing so. Drag-along combines all
element-by-element operations in an incoming expression into a single, more
complex, element-by-element operation which need only be done once for each
element of the result array. This is based on the fact that for most APL operators, F,

     A F B means: for all L ∈ ⍳⍴(A F B),
          (A F B)[;/L] ↔ (F1 A)[;/L] F (F2 B)[;/L],

where F1 and F2 depend on F and are normally the identity function. Simply
stated, this says that a single element of an array-valued expression can be
computed by evaluating a similar expression of single elements.
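
The element-by-element property above is what makes deferral profitable. The Python sketch below is a minimal illustration, assuming vectors of equal length and only scalar dyadic functions: an expression is kept as a tree, and each element of the result is computed in one pass without building intermediate arrays. The classes `Leaf` and `Deferred` and the function `evaluate` are hypothetical names and do not correspond to structures in the machine.

```python
import operator


class Leaf:
    """An operand whose values are already stored."""

    def __init__(self, values):
        self.values = values

    def element(self, i):
        return self.values[i]


class Deferred:
    """A dyadic scalar function applied to two operands, not yet evaluated."""

    def __init__(self, fn, left, right):
        self.fn = fn
        self.left = left
        self.right = right

    def element(self, i):
        # One element of the result needs only one element of each operand.
        return self.fn(self.left.element(i), self.right.element(i))


def evaluate(expr, length):
    """Produce the result one element at a time; the only array created is
    the result itself."""
    return [expr.element(i) for i in range(length)]


if __name__ == "__main__":
    a, b, c = Leaf([1, 2, 3]), Leaf([10, 20, 30]), Leaf([2, 2, 2])
    expr = Deferred(operator.mul, Deferred(operator.add, a, b), c)
    print(evaluate(expr, 3))
```

Here the sum of `a` and `b` is never stored as an array; each of its elements exists only while the corresponding element of the product is being formed.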
The APL Machine is divided into two submachines, the Deferral Machine
and the Execution Machine, in order to facilitate drag-along and beating.
Conceptually, the DM is a dynamic, data-dependent compiler which examines incoming
expressions (machine code) and their operand values (data) and produces instructions
to be executed by the EM. This code is deferred in an instruction buffer and can
also be operated upon by the DM. At appropriate times, control is passed to the
EM, which executes the deferred instructions. Since EM code must compute an
array-valued result, a stack of iteration counters is used by the E-machine to
produce all elements of the result one at a time. A feature of the APLM which
makes it easy for the DM to manipulate its own deferred code is that programs
(and deferred code) are organized into segments which contain only relative
addresses. Thus pieces of program can be referenced by descriptors, and these
pieces can be relocated at will simply by changing the descriptors and not the code.
This scheme leads to the use of a stack of instruction counters, each one of which
refers to a currently active segment in either the EM or the DM. Thus it is easy
for the machine to change state and recover previous states, thereby simplifying
the entire control process.
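
The role of the iteration counters can be suggested by a short sketch. The Python below is only an assumption-laden illustration of stepping through every index of a result array with an explicit stack of counters, one per dimension; it does not reproduce the E-machine's registers or instruction set, and the name `result_indices` is hypothetical.

```python
def result_indices(shape):
    """Yield each index of an array of the given shape in row-major order,
    using an explicit stack of counters (one per axis)."""
    if any(extent == 0 for extent in shape):
        return  # an empty array has no elements to produce
    counters = [0] * len(shape)
    while True:
        yield tuple(counters)
        axis = len(shape) - 1
        # Advance the innermost counter; on overflow reset it and carry outward.
        while axis >= 0:
            counters[axis] += 1
            if counters[axis] < shape[axis]:
                break
            counters[axis] = 0
            axis -= 1
        if axis < 0:
            return  # every counter has wrapped around: all elements visited
```

Each index produced in this way would be handed to the deferred element-by-element code, which computes and stores the corresponding element of the result.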

Chapter V contains a discussion of the machine design in which it is shown
that at worst, the APL Machine performs the same as a naive machine executing
the same program and at best shows a significant improvement. The primary
parameters used in the evaluation are measures of memory utilization. Other
measures, such as encoding densities, are not appropriate, as this aspect of the
machine design has not been specified. Such measures should be taken into account,
however, if it is desired to implement a machine such as this. The evaluation of
a subset of APL containing only scalar arithmetic operators and select operators
shows that the APLM approaches the theoretical minimum of memory accesses
and temporary storage utilization for this class. Further, the ratio of accessing
operations between the NM and the APLM is significant, since the NM expends
effort for fetching and storing in proportion to the number of operators in an
expression, while the APLM does fetches in proportion to the number of operands
and stores only once. Similarly, it is noted that for this class of expressions,
the APLM needs to allocate space only for the result of an expression, while the
NM requires temporary storage which is a function of the tree structure of the
expression being evaluated.
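
The contrast in access counts can be made concrete with a deliberately simplified model. The Python sketch below assumes an expression built only from dyadic scalar operators applied to arrays of n elements each, and it ignores descriptor overhead, the additive constants of Chapter V, and temporary storage. The function names and the model itself are assumptions made for illustration, not figures taken from the analysis.

```python
def naive_accesses(num_operators, n):
    """Operator-at-a-time evaluation: each dyadic operator fetches two
    n-element operands and stores one n-element result or temporary."""
    return {"fetches": 2 * n * num_operators, "stores": n * num_operators}


def aplm_accesses(num_operators, n):
    """Deferred evaluation: each operand is fetched once per result element
    and only the final result is stored. An expression with k dyadic
    operators is assumed to have k + 1 operand occurrences."""
    num_operands = num_operators + 1
    return {"fetches": n * num_operands, "stores": n}


if __name__ == "__main__":
    for k in (1, 3, 10):
        print(k, naive_accesses(k, 100), aplm_accesses(k, 100))
```

Under this model the store ratio grows in proportion to the number of operators, while the fetch ratio approaches 2 as expressions become longer.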

In the same chapter, an analysis of an APL "one-liner" and of a matrix inversion
program containing a more general mix of operators shows that the APLM does
better than the NM by at least a factor of 2 on these measures. A final observation
is that the APLM described here is not significantly different in complexity from
a naive machine. Thus, it could presumably be implemented with approximately
the same resources. Hence, it appears that this design is an improvement and
could profitably be used in future incarnations of machines for APL.

Although the APL machine is an improvement over the naive approach, it
would be absurd to claim that it is the "final solution" to the problem. Clearly,
it is not. There are still some functions, such as compression or catenation,
which it handles awkwardly. Similarly, it is distasteful (and inefficient) to evaluate
operands of a GDF explicitly if they are other than simple select expressions.
Ideally, there should be no temporary storage used for the evaluation of expressions
without side effects (such as embedded assignment). Thus, there is still work
to be done on this problem.
B. Future Research

The ideas summarized here tend to fall into two classes — extensions or 
refinements of the work already reported, and new problems suggested by the 
current research. 

In the second category is the area of mathematical analysis of APL operators.
The work in Chapter II of this dissertation barely skims the surface of this topic.
The general problem, of course, is at the heart of "Computer Science," namely
the study of data structures and operations upon them. However, APL and its
extensions are rich in mathematical interest and this field deserves further,
more concentrated investigation. Similarly, the results found in Chapter II as
well as the structure of the machine have implications for language design. An
important next step is to take some of the ideas which appear in the machine or
the analysis and attempt to map them back into the programming language. As a
trivial example, the ease with which the machine evaluates select expressions
suggests that there ought to be the possibility of more general select expressions
allowed to the left of an assignment arrow, e.g., it should be possible to say
(1 1⍉M)←A, meaning assign A to the main diagonal of M. Again, the ease with which
the APLM does segment activation suggests that there should be some parallel
facility in a programming language. At the very least, APL should contain some
more sophisticated sequence-controlling operations such as case, conditional,
and repeat constructs. A final possibility along these lines is suggested by the
similarity among the various selection operations. The mere existence of such
a compact standard form suggests that there might be a different, perhaps more
general, set of selection primitives which would be desirable in a language like APL.
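
Assignment through a select expression follows naturally once selection is expressed as an access function. As a hedged illustration, the Python sketch below stores values on the main diagonal of a square matrix held in row-major order by stepping through storage with a stride of n+1, which is the access function a diagonal selection would yield. The function name and the zero-origin flat-list representation are assumptions made for this example.

```python
def assign_main_diagonal(flat, n, values):
    """Store `values` on the main diagonal of an n-by-n matrix kept as a
    flat row-major list, modifying the list in place."""
    if len(flat) != n * n or len(values) != n:
        raise ValueError("shape mismatch")
    offset, stride = 0, n + 1  # access function of the diagonal selection
    for k, value in enumerate(values):
        flat[offset + k * stride] = value
    return flat
```

No copy of the matrix is made; the selection determines only which storage locations receive the new values.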

In the direction of refinements there are several areas of interest. One is
to try to add more parallelism to the machine. In this work, we have used the
implied parallelism of APL in drag-along and beating, but it appears not to be
fully exploited. For instance, there is the interesting possibility of making
the DM and the EM more independent, thus gaining an amount of parallelism.
There is no reason, for example, why there could not be multiple copies of both,
working simultaneously on different parts of an expression or program. Another
place where parallelism could be exploited is in the E-machine. Instead of doing
everything serially, much could possibly be done on a grander scale.

It appears possible to extend the formulation of the standard form to include
more operators such as catenation, restructuring, rotation, compression,
expansion, and explicit indexing. If such a general form could be found, the operation
of the machine could be simplified and perhaps made more efficient.

In order to have any real implementation of the machine, it will have to be
extended to include instructions for input and output and other systems-type
functions. Also, as soon as an implementation is attempted, problems such as
encoding of data and instructions will have to be broached. Similarly, it will
probably be necessary to consider the question of data types in a real incarnation
of the APL machine. Another machine extension which might be considered is the
addition of a set of registers (possibly stacks) for eliminating some of the problems
of temporary storage in EM code which does not follow the stacking discipline of
VS. This, in turn, entails the addition of instructions to the machine's repertoire,
although these might not have to be visible to the programmer.

Although it runs counter to the idea of a language-oriented
machine, it might be desirable to give the (systems) programmer more direct
control over the E-machine. In particular, this would make it possible to
"precompile" particular segments for the EM when enough information is available in
advance. An interesting extension of this is to allow the EM to call upon the DM
in the same way that the DM uses the EM. This would make the overall system
more symmetric and might increase its power and versatility.

A further area of investigation combines language and machine design. This
is the problem of extending APL to include more general kinds of data structures,
such as lists or records, and then attempting to fit these into the structure of the
machine. This problem, in turn, makes further demands on the mathematical
analysis of the language and its operators.

Finally, it is important to investigate the possibility of extending some of
the methods and results of this work to other languages and data structures.

C. Concluding Remarks

This chapter has summarized the mathematical analysis and machine design 
reported in this dissertation and has indicated some directions for fruitful investi- 
gations in the future. It is pleasing to be able to end this work with a feeling of 
accomplishment, yet it is perhaps more satisfying to know that this is not really 
an ending, but a beginning. 



The Road goes ever on and on,
Down from the door where it began.
Now far ahead the Road has gone,
And I must follow, if I can,
Pursuing it with weary feet,
Until it meets some larger way,
Where many paths and errands meet.
And whither then?
I cannot say.

J. R. R. Tolkien